
Why does malloc rely on mmap starting from a certain threshold?

I was reading about malloc and found the following in its man page:

Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3). Allocations performed using mmap(2) are unaffected by the RLIMIT_DATA resource limit (see getrlimit(2)).

So basically, for allocations larger than MMAP_THRESHOLD, malloc starts using mmap.

  1. Is there any reason to switch to mmap for large chunks?
  2. Could this hurt the process's execution performance?
  3. Does the mmap system call force a context switch?
rkachach asked Oct 14 '15 14:10





1 Answer

(1) Pages acquired via anonymous mmap can be released via munmap, which is what glibc does. So for small allocations, free returns the memory to your process's heap, where it stays mapped in the process's address space and can be reused by later malloc calls; for large allocations, free returns the memory to the system as a whole.
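
As a rough illustration of the point above, the following sketch uses glibc's mallinfo2() (assumed available, i.e. glibc 2.33 or later) to watch the count of mmap-backed blocks change around an allocation. The helper names mmapped_blocks and blocks_added_by_malloc are made up for this example, and the behaviour is specific to glibc on Linux; other allocators may differ.

```c
#include <malloc.h>   /* mallinfo2, mallopt (glibc-specific) */
#include <stddef.h>
#include <stdlib.h>

/* Number of blocks glibc currently has allocated through mmap(2).
   Hypothetical helper; relies on the hblks field of struct mallinfo2. */
size_t mmapped_blocks(void)
{
    struct mallinfo2 mi = mallinfo2();
    return mi.hblks;
}

/* Allocates `size` bytes, writes one byte so the compiler cannot drop
   the allocation, and returns how many additional mmap-backed blocks
   exist afterwards. The pointer is handed back through `out` so the
   caller can free() it. Hypothetical helper for illustration only. */
size_t blocks_added_by_malloc(size_t size, void **out)
{
    size_t before = mmapped_blocks();
    volatile char *p = malloc(size);

    if (p != NULL)
        p[0] = 1;
    *out = (void *)p;
    return mmapped_blocks() - before;
}
```

Under those assumptions, a 1 MiB request in a fresh process should typically add one mmap-backed block, a 1 KiB request should add none, and the count should drop back once free() is called on the large pointer. Calling mallopt(M_MMAP_THRESHOLD, ...) first pins the threshold, since glibc otherwise adjusts it dynamically.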

(2) Pages acquired via anonymous mmap are not actually allocated until you access them the first time. At that point, the kernel has to zero them to avoid leaking information between processes. So, yes, the pages acquired by mmap are slower to access the first time than those recycled through your process's heap. Whether you will notice the difference depends on your application.
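
To make the first-access cost a bit more concrete, here is a minimal sketch that maps anonymous memory directly, checks that each page reads as zero, and counts the minor page faults reported by getrusage(2) while touching the pages. The function names are hypothetical, the fault count depends on kernel settings such as transparent huge pages, and some sandboxed environments may not report faults at all, so treat the number as indicative only.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor (no disk I/O) page faults taken by this process so far,
   or -1 if getrusage() fails. Hypothetical helper. */
long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Maps `bytes` of private anonymous memory, then for each page reads
   one byte (checking it is zero) and writes one byte (forcing a real
   page to be allocated). Returns the number of minor faults observed
   during the touching loop, or -1 on error. *all_zero is set to 1 if
   every byte that was read was zero. */
long faults_on_first_touch(size_t bytes, int *all_zero)
{
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p;
    long before, after;
    int zero = 1;

    if (page <= 0)
        return -1;
    p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    before = minor_faults();
    for (size_t off = 0; off < bytes; off += (size_t)page) {
        if (p[off] != 0)      /* fresh anonymous pages read as zero */
            zero = 0;
        p[off] = 1;           /* first write allocates the page */
    }
    after = minor_faults();

    munmap((void *)p, bytes);
    *all_zero = zero;
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling faults_on_first_touch(4 * 1024 * 1024, &z) and printing the result gives a feel for how many faults the first pass over a fresh mapping costs on a given machine; a second pass over already-touched pages would normally add few or none.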

The cost of not using mmap is that freed memory is still tied up by your process and unavailable to other processes on the system. So this is ultimately a trade-off.

(3) It does not "force" a context switch and is, I believe, unlikely to cause one. mmap does not actually allocate the pages; it just manipulates the page map for your process. That should typically be a non-blocking operation. (Although I admit I am not 100% sure about this.)

Nemo answered Nov 04 '22 15:11
