What is the Heap?

Lecture

Recap: Types of Memory

Memory comes in different types:

  • ELF .text: where the code lives

  • ELF .plt: where library function stubs live

  • ELF .got: where pointers to imported symbols live

  • ELF .bss: used for uninitialized global writable data (such as global arrays without initial values)

  • ELF .data: used for pre-initialized global writable data (such as global arrays with initial values)

  • ELF .rodata: used for global read-only data (such as string constants)

  • Stack: local variables, temporary storage, call stack metadata

But what if you needed a place to store long-lived dynamic memory, for example, a variable-length list of NPCs in a game? What if you needed dynamic memory allocation?

One Idea: mmap()

What if we mmap()ed memory as we need it? For example:

mmap(0, num_pages*0x100, ...)

Pros:

  • Allows dynamic allocation/deallocation according to changing program needs.

  • Allocated memory survives across functions.

Cons:

  • Inflexible allocation size (must be multiples of 4096 bytes).

  • Crazy slow (requires kernel involvement).

Smarter Solution

What if we wrote a library that mmap()ed a bunch of memory and handed out small chunks of it on demand!

The library could be used like:

char *firstname = allocate_memory(128);
char *lastname = allocate_memory(256);
scanf("%s %s", firstname, lastname);
printf("Hello %s %s!", firstname, lastname);
free_memory(firstname);
free_memory(lastname);

Pictorially:

--------------------------------------------------
| firstname | lastname | mmap()ed but unassigned |
--------------------------------------------------

This idea is called a dynamic allocator in the real world.

Dynamic Allocators Exist!

We're not the first to have this idea:

  • General Purpose:

    • Doug Lea (pictured) releases dlmalloc into public domain in 1987.

  • Linux:

    • ptmalloc (Posix Thread aware fork of dlmalloc)

  • FreeBSD:

    • jemalloc (also used in Firefox, Android)

  • Windows:

    • Segment Heap

    • NT Heap

  • Kernel allocators:

    • kmalloc (Linux kernel memory allocator)

    • kalloc (iOS kernel memory allocator)

In CTF, we are interested in ptmalloc since we are dealing with Linux binaries most of the times.

The memory space managed by a dynamic allocator is colloquially known as "The Heap". It has nothing to do with the heap data structure.

What Does the Heap Do?

The heap, as implemented by ptmalloc/glibc (and analogues), provides:

  • malloc(): allocate some memory (chunk)

  • free(): free a prior allocated chunk

And some auxiliary functions:

  • realloc(): change the size of an allocation

  • calloc(): allocate and zero-out memory

These functions are used, extensively, by practically every single non-trivial piece of software.

How Does the Heap Work?

ptmalloc actually does not use mmap()!

The Data Segment:

  • Historic oddity from segmented memory spaces of yore with ASLR, placed randomly into memory near-ish the PIE base

  • Starts out with a size of 0

  • Managed by the brk and sbrk system calls:

    • sbrk(NULL) returns the end of the data segment

    • sbrk(delta) expands the end of the data segment by delta bytes

    • brk(addr) expands the end of the data segment to addr

Under the hood, this is managed just like mmap(). ptmalloc slices off bits of the data segment for small allocations, and uses mmap() for large allocations.

Dangers of the Heap

What can go wrong?

The heap is:

  1. Used by imperfect human programmers

    • Humans forget to free memory

    • Humans forget all the spots where they store pointers to data

    • Humans forget what they've freed

  2. A library that strives for performance

    • Allocation and deallocation needs to be fast, or programs will slow down

    • Optimizations often leave security as an afterthought

Bugs caused by #1 become security issues due to #2 if not caught!

Here Lies Danger

How to detect issues?

  • Valgrind can detect heap misuse (if your testcases trigger it)

  • glibc itself has some hardening techniques:

    • export MALLOC_CHECK_=1

    • export MALLOC_PERTURB_=1

    • export MALLOC_MMAP_THRESHOLD_=1

  • There are various "more secure" allocators being developed (but not really deployed)

Like many other issues, no general techniques exist for detecting dynamic allocation errors.

Last updated