
appending to a memory-mapped file

I'm constantly appending to a file of stock quotes (ints, longs, doubles, etc.). I have this file mapped into memory with mmap.

What's the most efficient way to make newly appended data available as part of the memory mapping?

I understand that I can open the file again (with a new file descriptor) and mmap it to get the new data, but that seems inefficient. Another approach that has been suggested to me is to pre-allocate the file in 1 MB chunks, write to a specific position until reaching the end, then ftruncate the file to grow it by another 1 MB.

Are there other approaches?

Does Boost help with this?

Joel Reymont asked Dec 16 '10 11:12


2 Answers

Boost.IOStreams only provides fixed-size memory-mapped files, so it won't help with your specific problem. Linux has an interface, mremap, which works as follows:

    void *new_mapping = mremap(mapping, size, size + GROWTH, MREMAP_MAYMOVE);
    if (new_mapping == MAP_FAILED) {
        // handle error
    } else {
        mapping = new_mapping;
    }

This is non-portable, however (and poorly documented). Mac OS X seems not to have mremap.
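As a rough illustration of the mremap route, here is a minimal Linux-only sketch. The helper name `grow_mapping` is hypothetical, and it assumes the mapping was created with MAP_SHARED over the same file descriptor. It grows the file before growing the mapping, because touching mapped pages that lie beyond the end of the file raises SIGBUS.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // mremap is a Linux extension
#endif
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: extend the file behind `fd` to `new_len` bytes and
// grow the existing mapping to cover it. Returns the (possibly moved)
// mapping, or MAP_FAILED on error.
void *grow_mapping(int fd, void *map, std::size_t old_len, std::size_t new_len)
{
    assert(new_len > old_len);

    // Grow the file first so the new pages are backed by real file data.
    if (ftruncate(fd, static_cast<off_t>(new_len)) != 0)
        return MAP_FAILED;

    // MREMAP_MAYMOVE lets the kernel relocate the mapping when it cannot
    // be extended in place, so pointers into the old mapping may go stale.
    return mremap(map, old_len, new_len, MREMAP_MAYMOVE);
}
```

Because the mapping may move, any code that caches raw pointers into it would need to store offsets instead and recompute addresses after each call.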

In any case, you don't need to reopen the file; just munmap it and mmap it again with the new length:

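    #include <sys/mman.h>
    #include <unistd.h>

    void *append(int fd, char const *data, size_t nbytes, void *map, size_t &len)
    {
        // TODO: check for errors here! (write() can return -1 or a short count)
        ssize_t written = write(fd, data, nbytes);
        munmap(map, len);
        len += written;
        // The flags argument must be MAP_SHARED or MAP_PRIVATE; 0 fails with EINVAL.
        return mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    }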

A pre-allocation scheme may be very useful here. Be sure to keep track of the file's actual length and truncate it once more before closing.
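One way the pre-allocation idea could look is sketched below. The `AppendMap` struct, the `append_bytes` and `finish` functions, and the 1 MB chunk size (taken from the question) are all illustrative assumptions, not an established API. The file is extended in whole chunks with ftruncate, data is copied straight into the mapping, and the file is cut back to its real length at the end.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: grow in 1 MB steps, as suggested in the question.
constexpr std::size_t kChunk = 1 << 20;

struct AppendMap {
    int fd = -1;               // file opened read/write by the caller
    char *base = nullptr;      // start of the current mapping
    std::size_t used = 0;      // bytes of real data written so far
    std::size_t capacity = 0;  // bytes allocated in the file and mapped
};

// Copy nbytes into the mapping, extending the file and the mapping by
// whole chunks when the data would not fit.
bool append_bytes(AppendMap &m, const void *data, std::size_t nbytes)
{
    assert(m.fd >= 0);
    if (m.used + nbytes > m.capacity) {
        std::size_t need = m.used + nbytes;
        std::size_t new_cap = (need + kChunk - 1) / kChunk * kChunk;
        if (ftruncate(m.fd, static_cast<off_t>(new_cap)) != 0)
            return false;
        void *p = mmap(nullptr, new_cap, PROT_READ | PROT_WRITE,
                       MAP_SHARED, m.fd, 0);
        if (p == MAP_FAILED)
            return false;
        if (m.base)
            munmap(m.base, m.capacity);  // drop the old, smaller mapping
        m.base = static_cast<char *>(p);
        m.capacity = new_cap;
    }
    std::memcpy(m.base + m.used, data, nbytes);
    m.used += nbytes;
    return true;
}

// Unmap and truncate the file back to the bytes actually written.
bool finish(AppendMap &m)
{
    if (m.base && munmap(m.base, m.capacity) != 0)
        return false;
    m.base = nullptr;
    m.capacity = 0;
    return ftruncate(m.fd, static_cast<off_t>(m.used)) == 0;
}
```

With this layout a remap happens only once per chunk rather than once per append, which is the main appeal of pre-allocating. A reader in another process would still need some way to learn the current value of `used`, since the file size no longer reflects it.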

Fred Foo answered Sep 19 '22 16:09


I know an answer has already been accepted, but maybe mine will help someone else.

Allocate a large file ahead of time, say 10 GiB. Create three of these files up front; I call them volumes. Keep track of your last known write position somewhere (in a header, in another file, etc.) and keep appending from that point. When you reach the end of a volume, switch to the next one. If there are no more volumes, create another. In practice you would create new volumes a few ahead of the one being written, so that appends never block waiting for a volume to be created.

That's how we store continuous incoming video/audio in a surveillance DVR system where I work. We don't want to waste space on file names for video clips, so instead of a real file system we use flat files and track only offsets, frame information (fps, frame type, width/height, etc.), recording time, and camera channel.

For the kind of work you are doing, storage space is cheap, whereas your time is invaluable, so grab as much as you want ahead of time. You're basically implementing your own file system optimized for your needs. General-purpose file systems are built for different needs than the ones that come up in specialized fields like ours.
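The bookkeeping behind the volume scheme can be sketched in a few lines. This is only an illustration, not the DVR system's actual code: the `Cursor` and `Slot` types and the `reserve` function are hypothetical names, and the sketch assumes that a record is never split across two volumes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical write position; in practice it would be persisted in a
// header or a side file, as the answer describes.
struct Cursor {
    std::uint32_t volume = 0;  // index of the volume being written
    std::uint64_t offset = 0;  // next free byte within that volume
};

// Where a reserved record should be written.
struct Slot {
    std::uint32_t volume;
    std::uint64_t offset;
};

// Reserve nbytes for the next record, rolling over to the next
// pre-allocated volume when the record does not fit in the current one.
Slot reserve(Cursor &c, std::uint64_t volume_size, std::uint64_t nbytes)
{
    assert(nbytes <= volume_size);
    if (nbytes > volume_size - c.offset) {
        ++c.volume;    // assumption: this volume was created ahead of time
        c.offset = 0;
    }
    Slot s{c.volume, c.offset};
    c.offset += nbytes;
    return s;
}
```

The caller would then write the record at `s.offset` inside the mapping of volume `s.volume`, and create a fresh volume in the background whenever the number of unused volumes drops below some threshold.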

blakesteel answered Sep 18 '22 16:09
