The toddler’s introduction to Heap exploitation (Part 1)

In my introductory post I talked about dynamic memory allocation and I referenced various implementations that are used to tackle this problem. In this post I am going to focus on the GNU C library’s memory allocation which is based on ptmalloc2 (pthreads malloc) and it is derived from dlmalloc (Doug Lea’s malloc).

The Heap organisation char

The heap, assigned to each program, is organised according to three major components:

  • The arena, which represents a memory area assigned to each thread and contains references to one or more heaps. The main thread’s arena is called “main arena” and doesn’t have multiple heaps.
  • The heap, which is a contiguous region of memory that is subdivided into chunks and belongs to exactly one arena.
  • The chunk which is a small range of memory that can be allocated, freed, or combined with adjacent chunks into larger ranges. Each chunk exists in one heap and belongs to one arena.

Recall the parking lot from my previous post ? if you liken the heap as parking building with multiple levels and slots, then you have the following structure:

Arena, Heaps and Chunks in a Parking lot context

As P-Acme’s (the parking company) growth keeps increasing, new buildings are built and new employees are hired to manage them. Similarly when a new thread is created the system assigns a new arena to it. There is a limit to the number of arenas, so when this limit is reached the program has to reuse the existing ones. The program below creates two threads, while each one of them will get a separate arena:

ptmalloc overview

Now that we have an idea about the heap structure lets get a full overview of the ptmalloc allocator.

To allocate memory for the “main arena”, malloc invokes the sbrk function and despite the requested size, the system will assign 132 KB of memory. Further malloc invocations in the main thread will keep using the same arena while in case the memory is exhausted, the heap size will be increased. For every new thread a different arena (thread arena) is created by calling the mmap function and once again 132 KB are allocated. The maximum number of arenas is eight times the number of CPUs in the system (unless the user specifies otherwise).

Glibc uses a C struct called malloc_state to maintain information about each arena. For the main arena this information is stored in a global variable while for the rest it is stored in the thread arenas themselves:

Next comes the heap_info C struct, with heap metadata information:

Note that, every heap has its own header except the one that belongs to the main arena. Finally, we have the malloc_chunk C struct which, as you probably guessed, contains information about a chunk. A high level overview about the concepts that we discussed so far can be found below:

Figure source:

As Glibc’s malloc is chunk-oriented, we will focus on the malloc_chunk (the green block above) and see how it is structured.

The chunk

If you examine the malloc_chunk C struct, you will notice that the fd,bk, fb_nextsize and bk_nextsize properties are used only when the chunk is free (see code snippet bellow) while the mchunk_prev_size is used only in case the previous chunk is free:

The mchunk_size indicates the size of the current chunk and since it is a multiple of 8 bytes, the last 3 bits (|A|M|P| to give the some names) are used as flags to indicate the following states:

  • The A flag is used to inform the allocator if the chunk comes from the main arena (A == 1) or not (A == 0).
  • The M flag is set to 1, then this chunk was allocated with a single call to mmap and is not part of a heap at all.
  • The P flag indicates if the previous chunk is used(P==1) or free (P==0) so the current chunk can safely unite with the previous:
chunk_n states at time t and time t+1 according to previous chunk’s allocation

Note that, since chunks are adjacent to each other in memory, if you know the address of the first chunk (lowest address) in a heap, you can iterate through all the chunks in the heap by using the size information.

Assigning parking slots

Lets for now forget about multiple arenas and focus on the main one. In order to understand how the chunk assignment works we are going to use the parking model and whats how Alice handles a parking request:

  • Alice holds a lists of all the free parking slots (the chunks) written down in notebooks called bins. Remember that a car can occupy more than one slots, so when a car is exiting the parking Alice adds the freed slots to her list.
  • When a car enters Alice calculates the size of it and searches in her list for a suitable place. If she finds one, she removes it from her list and creates a parking ticket that gives it to the car driver. This ticket contains the index, the size of the parking slot as well as mark that indicates if the previous (adjacent) slot is in use. This will help her to unify the parking areas on the exit of the car.
  • If there are no available slots she will try to create new ones (somewhere at the end of parking place — the top chunk). If no place is available, she will ask the company to extend the parking. If the company approves, Alice will send the car to the new location otherwise she will decline the entrance (malloc returns null).

That’s all for now, stay in touch for the next parts !











Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store