
How to portably extend a file accessed using mmap()

Tags: linux, macos, mmap

We're experimenting with changing SQLite, an embedded database system, to use mmap() instead of the usual read() and write() calls to access the database file on disk, using a single large mapping for the entire file. Assume that the file is small enough that we have no trouble finding space for it in virtual memory.

So far so good. In many cases using mmap() seems to be a little faster than read() and write(), and in some cases much faster.

Resizing the mapping in order to commit a write-transaction that extends the database file seems to be a problem. In order to extend the database file, the code could do something like this:

  ftruncate();    // extend the database file on disk
  munmap();       // unmap the current mapping (it's now too small)
  mmap();         // create a new, larger, mapping

then copy the new data into the end of the new memory mapping. However, the munmap/mmap is undesirable as it means the next time each page of the database file is accessed a minor page fault occurs and the system has to search the OS page cache for the correct frame to associate with the virtual memory address. In other words, it slows down subsequent database reads.
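Spelled out with arguments, the three calls above might look like the following minimal sketch. The helper name remap_by_unmapping, the demo function and the temporary-file path are hypothetical and only there for illustration; this is not SQLite's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: grow the file behind fd to new_size bytes and replace
 * the mapping at old_map (old_size bytes) with a fresh, larger one.
 * Returns the new base address, or MAP_FAILED on error. */
static void *remap_by_unmapping(int fd, void *old_map, size_t old_size,
                                size_t new_size)
{
    assert(new_size >= old_size);

    /* 1. Extend the database file on disk. */
    if (ftruncate(fd, (off_t)new_size) != 0)
        return MAP_FAILED;

    /* 2. Unmap the current mapping; it is now too small. */
    if (old_map != NULL && munmap(old_map, old_size) != 0)
        return MAP_FAILED;

    /* 3. Create a new, larger mapping. Every page in it starts out
     *    unpopulated, which is where the minor faults come from. */
    return mmap(NULL, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Usage sketch: returns 0 if data written before the resize is still
 * visible afterwards and the new tail of the file is writable. */
static int demo_remap(void)
{
    char path[] = "/tmp/mmap_grow_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* the file goes away once fd is closed */

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = remap_by_unmapping(fd, NULL, 0, page);
    if (p == MAP_FAILED) { close(fd); return -1; }
    memcpy(p, "hello", 5);

    p = remap_by_unmapping(fd, p, page, 2 * page);
    if (p == MAP_FAILED) { close(fd); return -1; }
    p[page] = 'x';                  /* first byte of the newly added region */

    int ok = memcmp(p, "hello", 5) == 0 && p[page] == 'x';
    munmap(p, 2 * page);
    close(fd);
    return ok ? 0 : -1;
}
```

Because the mapping is MAP_SHARED, the bytes written before the resize live in the file and reappear in the new mapping, but only after each page has been faulted in again.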

On Linux, we can use the non-standard mremap() system call instead of munmap()/mmap() to resize the mapping. This seems to avoid the minor page faults.
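A minimal sketch of the Linux-only variant, assuming glibc (which needs _GNU_SOURCE defined before any header to declare mremap()). The helper name grow_with_mremap, the demo function and the temporary-file path are hypothetical.

```c
#define _GNU_SOURCE             /* exposes mremap(); Linux/glibc only */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: extend the file to new_size bytes and resize the
 * existing mapping to match. Returns the (possibly moved) base address, or
 * MAP_FAILED on error, in which case the old mapping is still in place. */
static void *grow_with_mremap(int fd, void *old_map, size_t old_size,
                              size_t new_size)
{
    assert(new_size > old_size);

    if (ftruncate(fd, (off_t)new_size) != 0)
        return MAP_FAILED;

    /* MREMAP_MAYMOVE lets the kernel relocate the mapping when it cannot
     * be grown in place, so callers must use the returned address. */
    return mremap(old_map, old_size, new_size, MREMAP_MAYMOVE);
}

/* Usage sketch: returns 0 if the old contents survive the resize and the
 * new tail of the file is writable through the resized mapping. */
static int demo_mremap(void)
{
    char path[] = "/tmp/mremap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)page) != 0) { close(fd); return -1; }

    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    memcpy(p, "hello", 5);

    char *q = grow_with_mremap(fd, p, page, 2 * page);
    if (q == MAP_FAILED) { munmap(p, page); close(fd); return -1; }
    q[page] = 'x';

    int ok = memcmp(q, "hello", 5) == 0 && q[page] == 'x';
    munmap(q, 2 * page);
    close(fd);
    return ok ? 0 : -1;
}
```

Any pointers into the old mapping have to be treated as stale after the call, since MREMAP_MAYMOVE may change the base address.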

QUESTION: How should this be dealt with on other systems, like OSX, that do not have mremap()?


We have two ideas at present, and a question regarding each:

1) Create mappings larger than the database file. Then, when extending the database file, simply call ftruncate() to extend the file on disk and continue using the same mapping.

This would be ideal, and seems to work in practice. However, we're worried about this warning in the man page:

"The effect of changing the size of the underlying file of a mapping on the pages that correspond to added or removed regions of the file is unspecified."

QUESTION: Is this something we should be worried about? Or an anachronism at this point?
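For concreteness, idea 1 amounts to something like the sketch below. It deliberately relies on the behaviour the man page calls unspecified, so it shows what "seems to work in practice" means rather than anything guaranteed; the function name and temporary-file path are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if a mapping created larger than the file can be used to reach
 * pages that were added to the file later with ftruncate(). */
static int demo_oversized_mapping(void)
{
    char path[] = "/tmp/mmap_over_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)page) != 0) { close(fd); return -1; }

    /* Map two pages although the file currently holds only one. Touching
     * the second page at this point would raise SIGBUS. */
    char *p = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    memcpy(p, "hello", 5);

    /* Extend the file on disk; the mapping itself is left untouched. */
    if (ftruncate(fd, (off_t)(2 * page)) != 0) {
        munmap(p, 2 * page);
        close(fd);
        return -1;
    }

    /* ASSUMPTION: the existing mapping now covers the added page. This is
     * exactly the case the quoted warning leaves unspecified. */
    p[page] = 'x';

    int ok = memcmp(p, "hello", 5) == 0 && p[page] == 'x';
    munmap(p, 2 * page);
    close(fd);
    return ok ? 0 : -1;
}
```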

2) When extending the database file, use the first argument to mmap() to request a mapping of the new pages of the database file located immediately after the current mapping in virtual memory, effectively extending the initial mapping. If the system can't honour the request to place the new mapping immediately after the first, fall back to munmap/mmap.

In practice, we've found that OSX is pretty good about positioning mappings in this way, so this trick works there.
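Idea 2 could be sketched roughly as follows. The helper name extend_with_hint, its return-code convention, the demo function and the temporary-file path are all hypothetical; old_size is assumed to be a multiple of the page size, since mmap() requires a page-aligned file offset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: grow the file to new_size and try to map the added
 * bytes directly after the existing mapping at base.
 * Returns 0 if the mapping was extended in place, 1 if the kernel placed
 * the new mapping elsewhere (caller should fall back), -1 on error. */
static int extend_with_hint(int fd, char *base, size_t old_size,
                            size_t new_size)
{
    size_t extra = new_size - old_size;
    void *want = base + old_size;
    void *got;

    if (ftruncate(fd, (off_t)new_size) != 0)
        return -1;

    /* No MAP_FIXED: the address is only a hint, so whatever may already be
     * mapped at that address is never replaced. */
    got = mmap(want, extra, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
               (off_t)old_size);
    if (got == MAP_FAILED)
        return -1;
    if (got != want) {
        munmap(got, extra);         /* hint not honoured; undo */
        return 1;
    }
    return 0;
}

/* Usage sketch: returns 0 if the file ends up fully mapped and consistent,
 * whether or not the hint was honoured. */
static int demo_hint(void)
{
    char path[] = "/tmp/mmap_hint_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, (off_t)page) != 0) { close(fd); return -1; }
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    memcpy(p, "hello", 5);

    int ok;
    int rc = extend_with_hint(fd, p, page, 2 * page);
    if (rc == 0) {                  /* extended in place */
        p[page] = 'x';
        ok = memcmp(p, "hello", 5) == 0 && p[page] == 'x';
        munmap(p + page, page);     /* unmapped piecewise in this sketch */
        munmap(p, page);
    } else if (rc == 1) {           /* fall back to munmap/mmap */
        munmap(p, page);
        p = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { close(fd); return -1; }
        p[page] = 'x';
        ok = memcmp(p, "hello", 5) == 0 && p[page] == 'x';
        munmap(p, 2 * page);
    } else {
        munmap(p, page);
        ok = 0;
    }
    close(fd);
    return ok ? 0 : -1;
}
```

Whether the hint is honoured depends on what already occupies the address range after the first mapping, so the fallback path has to stay in place.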

QUESTION: if the system does allocate the second mapping immediately following the first in virtual memory, is it then safe to eventually unmap them both using a single big call to munmap()?

Dan Kennedy asked Mar 28 '13 14:03



1 Answer

Idea 2 will work, but you don't have to rely on the OS happening to have space available: you can reserve your address space beforehand so that your fixed mappings always succeed.

For instance, to reserve one gigabyte of address space, do:

mmap(NULL, 1U << 30, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

This reserves one gigabyte of contiguous address space without actually allocating any memory or resources. You can then perform future mappings over this space and they will succeed. So mmap the file into the beginning of the space returned, then mmap further sections of the file as needed using the MAP_FIXED flag. The mmaps will succeed because your address space is already allocated and reserved by you.

Note: Linux also has the MAP_NORESERVE flag, which is the behavior you would want for the initial mapping if you were allocating RAM, but in my testing it is ignored, as PROT_NONE is sufficient to say you don't want any resources allocated yet.

John Meacham answered Oct 09 '22 04:10