Can anyone explain how <code>malloc()</code> works internally? I have sometimes run <code>strace program</code> and seen a lot of <code>sbrk</code> system calls. <code>man sbrk</code> mentions that it is used in <code>malloc()</code>, but not much more.


How is malloc() implemented internally? [duplicate]

2 Answers

The sbrk system call moves the "border" of the data segment, that is, the end of the area in which a program may read and write data. Moving it lets the segment grow or shrink, although allocators rarely hand memory back to the kernel this way (glibc, for example, only trims free space at the very top of the heap). Aside from that, there's also mmap, which is used to map files into memory but can also be used to allocate memory (if you need to allocate shared memory, mmap is how you do it).

So you have two methods of getting more memory from the kernel: sbrk and mmap. There are various strategies for organizing the memory that you've got from the kernel.
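To make the two methods concrete, here is a minimal sketch for Linux and similar Unix systems. The helper names grow_with_sbrk and grow_with_mmap are made up for illustration; only sbrk, mmap and their flags are real APIs.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: grow the data segment by `bytes`.
 * sbrk() returns the previous break, which is the start of the new area,
 * or (void *)-1 on failure. */
static void *grow_with_sbrk(size_t bytes)
{
    void *old_break = sbrk((intptr_t)bytes);
    return old_break == (void *)-1 ? NULL : old_break;
}

/* Hypothetical helper: ask the kernel for a private anonymous mapping,
 * i.e. zero-filled memory that is not backed by any file. */
static void *grow_with_mmap(size_t bytes)
{
    assert(bytes > 0);   /* mmap() rejects zero-length mappings */
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Memory obtained with mmap can be returned individually with munmap, whereas memory obtained with sbrk can only be given back by moving the break down again.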

One naive way is to partition it into zones, often called "buckets", which are dedicated to certain structure sizes. For example, a malloc implementation could create buckets for 16, 64, 256 and 1024 byte structures. If you ask malloc for memory of a given size, it rounds that number up to the next bucket size and then gives you an element from that bucket. If you need a bigger area, malloc could use mmap to allocate directly with the kernel. If the bucket of a certain size is empty, malloc could use sbrk to get more space for a new bucket.
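The rounding step could look roughly like this. It is only a sketch using the example sizes from above; bucket_sizes and bucket_index are hypothetical names, not taken from any real malloc.

```c
#include <stddef.h>

/* Example bucket sizes from the text; a real allocator picks its own. */
static const size_t bucket_sizes[] = {16, 64, 256, 1024};
#define NUM_BUCKETS (sizeof bucket_sizes / sizeof bucket_sizes[0])

/* Return the index of the smallest bucket that can hold `size` bytes,
 * or -1 if the request is too big for any bucket (the caller would then
 * fall back to something like mmap). */
static int bucket_index(size_t size)
{
    for (size_t i = 0; i < NUM_BUCKETS; i++) {
        if (size <= bucket_sizes[i])
            return (int)i;
    }
    return -1;
}
```

With these sizes, a request for 17 bytes lands in the 64-byte bucket, which shows the space overhead this simple scheme can have.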

There are various malloc designs and there is probably no one true way of implementing malloc, as you need to make a compromise between speed, overhead and space efficiency (avoiding fragmentation). For example, if a bucket runs out of elements, an implementation might take an element from a bigger bucket, split it up and add the pieces to the bucket that ran out. This would be quite space efficient but would not be possible with every design. If you just get another bucket via sbrk/mmap, that might be faster and even easier, but not as space efficient. Also, the design must of course take into account that "free" needs to make space available to malloc again somehow. You don't just hand out memory without reusing it.
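As a rough illustration of how "free" can make space available again, here is a sketch of a single fixed-size bucket that threads its free elements onto a linked list. Everything here (slot, bucket_alloc, bucket_free, the 64-byte size, the 4-element pool) is made up for the example, and the pool is a static array standing in for memory obtained from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_SIZE  64   /* hypothetical element size of this bucket */
#define SLOT_COUNT 4    /* hypothetical number of elements in the pool */

/* A free slot stores the link to the next free slot inside itself,
 * so the free list costs no extra memory. */
typedef union slot {
    union slot *next;               /* valid only while the slot is free */
    unsigned char data[SLOT_SIZE];  /* valid only while the slot is in use */
} slot;

static slot pool[SLOT_COUNT];
static slot *free_list;     /* slots that were handed back by bucket_free */
static size_t fresh_slots;  /* number of pool slots handed out so far */

static void *bucket_alloc(void)
{
    if (free_list != NULL) {            /* reuse a freed slot first */
        slot *s = free_list;
        free_list = s->next;
        return s;
    }
    if (fresh_slots < SLOT_COUNT)       /* otherwise take an untouched slot */
        return &pool[fresh_slots++];
    return NULL;  /* bucket empty: a real malloc would get more memory here */
}

static void bucket_free(void *p)
{
    assert(p != NULL);
    slot *s = p;
    s->next = free_list;                /* push onto the free list */
    free_list = s;
}
```

Both operations are constant time, which is a large part of why bucket designs are fast.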

If you're interested, the OpenSER/Kamailio SIP proxy has two malloc implementations (they need their own because they make heavy use of shared memory and the system malloc doesn't support shared memory). See: https://github.com/OpenSIPS/opensips/tree/master/mem

Then you could also have a look at the GNU libc malloc implementation, but that one is very complicated, IIRC.

110

answered Oct 18 '22 08:10

DarkDust

Simplistically malloc and free work like this:

malloc provides access to a process's heap. The heap is a construct in the C runtime library (commonly libc) that lets callers obtain exclusive use of blocks of memory inside it.

Each allocation on the heap is called a heap cell. This typically consists of a header that holds information on the size of the cell as well as a pointer to the next heap cell. This makes a heap effectively a linked list.

When one starts a process, the heap contains a single cell that contains all the heap space assigned on startup. This cell exists on the heap's free list.

When one calls malloc, the requested amount of memory is carved out of the large heap cell and returned by malloc. The remainder is formed into a new heap cell that holds all the rest of the memory.

When one frees memory, the heap cell is added to the end of the heap's free list. Subsequent malloc calls walk the free list looking for a cell of suitable size.
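The cell, split and free-list steps described so far can be sketched as follows. This is a toy first-fit allocator over a fixed static arena, not how any particular libc does it; the names cell, cell_alloc and cell_free are invented, and alignment is handled only by rounding sizes up to the header size.

```c
#include <stddef.h>

typedef struct cell {
    size_t size;        /* usable bytes that follow this header */
    struct cell *next;  /* next cell on the free list */
} cell;

#define ARENA_BYTES 4096   /* hypothetical heap space assigned on startup */
static _Alignas(max_align_t) unsigned char arena[ARENA_BYTES];
static cell *free_list;
static int initialized;

static void *cell_alloc(size_t want)
{
    if (!initialized) {    /* at startup: one big cell on the free list */
        free_list = (cell *)arena;
        free_list->size = ARENA_BYTES - sizeof(cell);
        free_list->next = NULL;
        initialized = 1;
    }
    if (want == 0)
        want = 1;
    if (want > ARENA_BYTES)
        return NULL;
    /* Round up so that every header stays suitably aligned. */
    want = (want + sizeof(cell) - 1) / sizeof(cell) * sizeof(cell);

    /* First fit: walk the free list until a cell is big enough. */
    for (cell **link = &free_list; *link != NULL; link = &(*link)->next) {
        cell *c = *link;
        if (c->size < want)
            continue;
        if (c->size >= want + 2 * sizeof(cell)) {
            /* Split: the remainder becomes a new free cell. */
            cell *rest = (cell *)((unsigned char *)(c + 1) + want);
            rest->size = c->size - want - sizeof(cell);
            rest->next = c->next;
            c->size = want;
            *link = rest;
        } else {
            *link = c->next;   /* too small to split: hand out whole cell */
        }
        return c + 1;          /* usable memory starts after the header */
    }
    return NULL;  /* a real malloc would now ask the OS for more memory */
}

static void cell_free(void *p)
{
    if (p == NULL)
        return;
    cell *c = (cell *)p - 1;   /* step back from the payload to the header */
    c->next = NULL;
    cell **link = &free_list;
    while (*link != NULL)      /* append at the end of the free list */
        link = &(*link)->next;
    *link = c;
}
```

Walking to the end of the list on every free is linear time; it is done here only to match the description, and pushing onto the front would be the cheaper choice.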

As can be expected, the heap can get fragmented, so the heap manager may from time to time try to merge adjacent heap cells.
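Merging adjacent cells might look something like the sketch below. It assumes, unlike the append-at-end free list described above, that the free list is kept sorted by address, since that makes neighbours easy to find; the cell layout and the name coalesce are again hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct cell {
    size_t size;        /* usable bytes that follow this header */
    struct cell *next;  /* next free cell, at a higher address */
} cell;

/* Merge every run of physically adjacent cells on an address-ordered
 * free list into a single larger cell. */
static void coalesce(cell *head)
{
    cell *c = head;
    while (c != NULL && c->next != NULL) {
        unsigned char *end = (unsigned char *)(c + 1) + c->size;
        /* Assumption: the list is sorted and cells never overlap. */
        assert((unsigned char *)c->next >= end);
        if (end == (unsigned char *)c->next) {
            /* The neighbour starts exactly where this cell ends:
             * absorb its header and payload, then unlink it. */
            c->size += sizeof(cell) + c->next->size;
            c->next = c->next->next;
        } else {
            c = c->next;
        }
    }
}
```

After a merge the loop stays on the same cell, so three or more neighbouring cells collapse into one in a single pass.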

When there is no memory left on the free list for a desired allocation, malloc calls brk or sbrk, which are the system calls that request more memory pages from the operating system.

Now there are a few modifications to optimize heap operations.

For large memory allocations (above an implementation-defined threshold, e.g. 128 KiB by default in glibc), the heap manager may go straight to the OS and allocate whole memory pages.
The heap may specify a minimum allocation size to prevent large amounts of fragmentation.
The heap may also divide itself into bins, one for small allocations and one for larger allocations, to make larger allocations quicker.
There are also clever mechanisms for optimizing multi-threaded heap allocation.
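For the first point, going straight to the OS could be sketched like this on a Unix-like system. The threshold value and the names large_header, should_bypass_heap, large_alloc and large_free are assumptions for the example; the idea is just that the mapping's length is stored in front of the returned pointer so that freeing can unmap it.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define LARGE_THRESHOLD ((size_t)128 * 1024)  /* hypothetical cutoff */

/* Header placed in front of the user memory; the union keeps the
 * payload aligned for any type. */
typedef union {
    size_t mapped_bytes;   /* total length passed to mmap */
    max_align_t align;
} large_header;

static int should_bypass_heap(size_t size)
{
    return size >= LARGE_THRESHOLD;
}

static void *large_alloc(size_t size)
{
    size_t total = size + sizeof(large_header);
    if (total < size)      /* size_t overflow */
        return NULL;
    large_header *h = mmap(NULL, total, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (h == MAP_FAILED)
        return NULL;
    h->mapped_bytes = total;
    return h + 1;          /* user memory starts after the header */
}

/* Returns 0 on success, like munmap. */
static int large_free(void *p)
{
    assert(p != NULL);
    large_header *h = (large_header *)p - 1;
    return munmap(h, h->mapped_bytes);
}
```

Unlike memory on the free list, such a block goes back to the kernel as soon as it is freed.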

answered Oct 18 '22 08:10

doron


Tags: c, memory, malloc, system-calls, sbrk

bodacydo
