
Reading huge files using Memory Mapped Files

I see many articles advising against memory-mapping huge files, so that the mapping alone doesn't use up the process's virtual address space.

How does that change in a 64-bit process, where the address space is dramatically larger? If I need to access a file randomly, is there any reason not to map the whole file at once? (The file is dozens of GB.)

Saar asked Mar 07 '12 20:03



1 Answer

On 64-bit, go ahead and map the file.

One thing to consider, based on Linux experience: if the access is truly random and the file is much bigger than you can expect to cache in RAM (so the chances of hitting a page again are slim), it can be worth passing MADV_RANDOM to madvise. The hint tells the kernel not to expect sequential access, so readahead is less useful, which helps stop touched file pages from steadily accumulating and pointlessly swapping other, actually useful stuff out. No idea what the equivalent Windows API is, though.
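
For illustration only, here is a minimal sketch of that approach in Python (3.8+), which exposes `madvise` and `MADV_RANDOM` on platforms that have them. The function names, the fixed 8-byte record layout, and the small temporary file are assumptions made up for the example; they are not from the question. A real file would be many GB, and the same pattern applies with `mmap`/`madvise` in C.

```python
import mmap
import os
import random
import tempfile


def open_random_access_map(path):
    """Map the whole file read-only and hint that access will be random."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file; the mapping remains valid after
        # the file object is closed.
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # madvise/MADV_RANDOM are only present where the OS provides them
    # (e.g. Linux), so guard the call instead of assuming it exists.
    if hasattr(mm, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        mm.madvise(mmap.MADV_RANDOM)
    return mm


def read_record(mm, index, record_size):
    """Return the fixed-size record at the given index (hypothetical layout)."""
    start = index * record_size
    return mm[start:start + record_size]


def demo():
    # Small stand-in file of 1000 little-endian 8-byte integers.
    record_size = 8
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(1000):
            tmp.write(i.to_bytes(record_size, "little"))
        path = tmp.name
    try:
        mm = open_random_access_map(path)
        try:
            idx = random.randrange(1000)
            value = int.from_bytes(read_record(mm, idx, record_size), "little")
            return idx, value
        finally:
            mm.close()
    finally:
        os.remove(path)
```

Each slice of the map only faults in the pages it touches, so mapping the whole file up front costs address space rather than RAM.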

timday answered Oct 19 '22 20:10