So, I understand that if you need some dynamically allocated memory, you can use malloc(). For example, your program reads a variable length file into a char[]. You don't know in advance how big to make your array, so you allocate the memory in runtime.
I'm trying to understand when you would use mmap(). I have read the man page and to be honest, I don't understand what the use case is.
Can somebody explain a use case to me in simple terms? Thanks in advance.
The mmap() function is used for mapping between a process address space and either files or devices. When a file is mapped to a process address space, the file can be accessed like an array in the program.
In short, mmap() is great if you're doing a large amount of IO in terms of total bytes transferred; this is because it reduces the number of copies needed, and can significantly reduce the number of kernel entries needed for reading cached data.
MMAP's perceived ease of use has seduced database management system (DBMS) developers for decades as a viable alternative to implementing a buffer pool. There are, however, severe correctness and performance issues with MMAP that are not immediately apparent.
Mmap is advantageous over malloc because memory used up by mmap is immediately returned to the OS. The memory used up by malloc is never returned unless there is a data segment break. This memory is specially kept to be reused.
mmap
can be used for a few things. First, a file-backed mapping. Instead of allocating memory with malloc
and reading the file, you map the whole file into memory without explicitly reading it. Now when you read from (or write to) that memory area, the operations act on the file, transparently. Why would you want to do this? It lets you easily process files that are larger than the available memory using the OS-provided paging mechanism. Even for smaller files, mmapping reduces the number of memory copies.
mmap
can also be used for an anonymous mapping. This mapping is not backed by a file, and is basically a request for a chunk of memory. If that sounds similar to malloc
, you are right. In fact, most implementations of malloc
will internally use an anonymous mmap
to provide a large memory area.
Another common use case is to have multiple processes map the same file as a shared mapping to obtain a shared memory region. The file doesn't have to be actually written to disk. shm_open
is a convenient way to make this happen.
Whenever you need to read/write blocks of data of a fixed size it's much simpler (and faster) to simply map the data file on disk to memory using mmap and acess it directly rather than allocate memory, read the file, access the data, potentially write the data back to disk, and free the memory.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With