
Why use mmap over fread?

Why/when is it better to use mmap(), as opposed to fread()'ing from a filestream in chunks into a byte array?

uint8_t my_buffer[MY_BUFFER_SIZE];
size_t bytes_read;

/* Read one chunk; a short count means end of file or a read error. */
bytes_read = fread(my_buffer, 1, sizeof(my_buffer), input_file);
if (MY_BUFFER_SIZE != bytes_read) {
    fprintf(stderr, "File read failed: %s\n", filepath);
    exit(1);
}
asked May 21 '15 14:05 by tarabyte


1 Answer

There are advantages to mapping a file instead of reading it as a stream:

  • If you intend to perform random access to different widely-spaced areas of the file, mapping might mean that only the pages you access need to be actually read, while keeping your code simple.

  • If multiple applications are going to be accessing the same file, mapping it means that it will only be read into memory once, as opposed to the situation where each application loads [part of] the file into its own private buffers.

  • If the file doesn't fit in memory, or would take up a large share of it, mapping can provide the illusion that it fits and simplify your program logic, while letting the operating system decide which parts of the file to page in and out of physical memory.

  • If the file contents change, you MAY get to see the new contents automatically. (This can be a dubious advantage.)
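
To make the random-access point concrete, here is a minimal sketch for a POSIX system. The helper name mapped_byte_at is hypothetical and not from the question; it maps a file read-only and fetches a single byte, so typically only the page holding that byte has to be read from disk (the kernel may still read ahead on its own).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fetch one byte at `offset` from the file at `path`
 * through a read-only mapping. Returns 0 on success, -1 on failure. */
int mapped_byte_at(const char *path, off_t offset, uint8_t *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Reject offsets outside the file; this also rules out empty files,
     * which cannot be mapped with a zero length. */
    struct stat st;
    if (fstat(fd, &st) != 0 || offset < 0 || offset >= st.st_size) {
        close(fd);
        return -1;
    }

    /* Map the whole file. Setting up the mapping does not read the data;
     * pages are brought in when they are first touched. */
    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    *out = ((const uint8_t *)map)[offset]; /* faults in the page holding it */

    munmap(map, (size_t)st.st_size);
    return 0;
}
```

Mapping and unmapping on every call is only for illustration; a real program would keep the mapping around for as long as it needs random access.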

There are disadvantages to mapping the file:

  • If you only need sequential access to the file, or the file is small, or you only need a small portion of it, the overhead of setting up a memory mapping and then taking page faults to bring the contents in can make mapping less efficient than simply reading the file.

  • If there is an I/O error reading the file, your application will most likely be killed on the spot instead of receiving a system call error to which your application can react gracefully. (Technically you can catch the SIGBUS in the former case but recovering properly from that kind of thing is not easy.)

  • If you are not using a 64-bit architecture and the file is very large, there might not be enough address space to map it.

  • mmap() is less portable than read() (or fread(), as you suggest).

  • mmap() works only on regular files (and not on every filesystem) and on some block devices; pipes, sockets, and terminals cannot be mapped.
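
For comparison, the streaming alternative can be sketched as below. The helper name sum_bytes_streaming and the 4096-byte chunk size are assumptions made for this example. It reads the file sequentially with fread() and reports a read error as an ordinary return value, which is the graceful failure path that a mapping does not give you.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: sum every byte of `path` by reading it sequentially
 * in fixed-size chunks. Returns 0 on success, -1 on open or read failure. */
int sum_bytes_streaming(const char *path, uint64_t *sum)
{
    assert(path != NULL && sum != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    uint8_t buf[4096]; /* chunk size is an arbitrary choice for this sketch */
    uint64_t total = 0;
    size_t n;

    /* fread() returns a short count at end of file, so keep reading
     * until it returns 0 instead of treating a short read as fatal. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < n; i++)
            total += buf[i];
    }

    /* A read error shows up here as a status the caller can handle,
     * rather than as a SIGBUS raised in the middle of a memory access. */
    int failed = ferror(fp);
    fclose(fp);
    if (failed)
        return -1;

    *sum = total;
    return 0;
}
```

Because it relies only on standard C I/O, this version also works on inputs that cannot be mapped, such as pipes.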

answered Sep 22 '22 03:09 by Celada