My program works with large data sets that need to be stored in contiguous memory (several gigabytes). Allocating memory using std::allocator (i.e. malloc or new) causes system stalls as large portions of virtual memory are reserved and physical memory fills up.
Since the program will mostly work on only small portions at a time, my question is whether using memory-mapped files would provide an advantage (i.e. mmap or the Windows equivalent). That is, creating a large sparse temporary file and mapping it into virtual memory. Or is there another technique that would change the system's paging strategy such that fewer pages are loaded into physical memory at a time?

I'm trying to avoid building a streaming mechanism that loads portions of a file at a time and instead rely on the system's VM paging.
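Concretely, something like the following minimal POSIX sketch is what I have in mind (the path and size are placeholders, error handling is abbreviated, and a 64-bit system is assumed):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

int main() {
    const size_t kSize = 4ULL << 30;  // 4 GiB backing store (placeholder size)

    // Create a temporary file and immediately unlink it so it is
    // cleaned up automatically once the descriptor is closed.
    char path[] = "/tmp/bigdata-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) { perror("mkstemp"); return 1; }
    unlink(path);

    // Extend the file to the full size; on most filesystems this
    // produces a sparse file, so no disk blocks are allocated yet.
    if (ftruncate(fd, kSize) != 0) { perror("ftruncate"); return 1; }

    // Map it into virtual memory; pages are faulted in (and written
    // back) on demand by the VMM instead of being allocated up front.
    void* base = mmap(nullptr, kSize, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    // Work on small portions at a time; only touched pages occupy RAM.
    auto* data = static_cast<unsigned char*>(base);
    data[0] = 1;
    data[kSize - 1] = 2;

    munmap(base, kSize);
    close(fd);
    return 0;
}
```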
Yes, mmap has the potential to speed things up.
Things to consider:

- malloc and free will use mmap with MAP_ANON anyway. So the difference in memory mapping a file is simply that you are getting the VMM to do the I/O for you.
- Use madvise with mmap to assist the VMM in paging well (see the sketch after this list).
- If you use open and read (plus, as erenon suggests, posix_fadvise), your file is still held in buffers anyway (i.e. it's not immediately written out) unless you also use O_DIRECT. So in both situations, you are relying on the kernel for I/O scheduling.
- If the data is already in a file, mmap would speed things up, especially in the non-sequential case. (In the sequential case, read wins.)
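For illustration, a minimal sketch of such hints, assuming base and length describe an existing mapping (hypothetical names; madvise is the Linux/BSD call, and posix_madvise with the POSIX_MADV_* constants is the portable spelling):

```cpp
#include <sys/mman.h>
#include <cstddef>

// Advise the VMM about how an existing mmap'd region will be used.
void advise_mapping(void* base, size_t length,
                    void* window, size_t window_len) {
    // Overall access will be scattered: discourage aggressive readahead.
    madvise(base, length, MADV_RANDOM);

    // About to walk one window front to back: enable readahead there...
    madvise(window, window_len, MADV_SEQUENTIAL);

    // ...or ask the kernel to prefetch it before the first access.
    madvise(window, window_len, MADV_WILLNEED);
}
```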
If using open and read, consider using posix_fadvise as well.
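For example, something along these lines (fd, offset, and length are hypothetical placeholders for an open descriptor and a range of interest):

```cpp
#include <fcntl.h>
#include <sys/types.h>

// Declare expected access patterns for a file read with open/read.
void advise_file(int fd, off_t offset, off_t length) {
    // Overall pattern for the whole file (offset 0, len 0 = to EOF).
    posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);

    // Ask the kernel to start readahead on a range we need soon.
    posix_fadvise(fd, offset, length, POSIX_FADV_WILLNEED);
}
```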
This really depends on your mmap() implementation. Mapping a file into memory has several advantages that can be exploited by the kernel:

- The kernel knows that the contents of the mmap() pages are already present on disk. If it decides to evict these pages, it can omit the write-back.
- You reduce copying operations: read() operations typically first read the data into kernel memory, then copy it over to user space (see the sketch below).
- The reduced copies also mean that less memory is used to store data from the file, which means more memory is available for other uses, and this can reduce paging as well.

This is also why it is generally a bad idea to use large caches within an I/O library: modern kernels already cache everything they ever read from disk, so caching a copy in user space means that the amount of data that can be cached is actually reduced.
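To illustrate the copy-reduction point, a minimal sketch contrasting the two paths, with fd and base as hypothetical names for an open descriptor and a MAP_SHARED mapping of the same file:

```cpp
#include <unistd.h>
#include <sys/types.h>

char read_vs_mmap(int fd, const char* base, off_t offset) {
    // read() path: the kernel pulls the data into the page cache,
    // then copies it a second time into the user-supplied buffer.
    char buf[4096];
    ssize_t n = pread(fd, buf, sizeof buf, offset);
    (void)n;

    // mmap() path: the page-cache page itself is mapped into the
    // process, so this access reaches the data without an extra copy.
    return base[offset];
}
```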
Of course, you also avoid a lot of headaches that result from buffering data of unknown size in your application. But that is just a convenience for you as a programmer.
However, even though the kernel can exploit these properties, it does not necessarily do so. My experience is that Linux mmap() is generally fine; on AIX, however, I have witnessed really bad mmap() performance. So, if your goal is performance, it's the old measure-compare-decide standby.