I see many articles advising against memory-mapping huge files with mmap, so that the mapping does not consume the whole virtual address space.
How does that change in a 64-bit process, where the address space is dramatically larger? If I need random access to a file that is dozens of GB in size, is there any reason not to map the whole file at once?
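Accessing memory-mapped files is faster than using direct read and write operations for two reasons. Firstly, a system call is orders of magnitude slower than a simple change to a program's local memory. Secondly, on most operating systems the mapped region is the kernel's page cache itself, so no extra copy has to be made in user space.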
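To make the difference concrete, here is a minimal sketch in Python (assuming a Unix-like system, since `os.pread` is not available on Windows). The helper names `read_with_syscall`, `read_with_mmap` and `demo` are made up for illustration. The first helper issues one system call per access; the second only touches mapped memory, and the kernel gets involved only if the page is not resident yet.

```python
import mmap
import os
import tempfile


def read_with_syscall(fd, offset, length):
    # One pread() system call per access: the kernel copies the bytes
    # from its page cache into a freshly allocated user-space buffer.
    return os.pread(fd, length, offset)


def read_with_mmap(mapping, offset, length):
    # No read system call here: the slice reads straight from the mapped
    # pages. (Python still copies the slice into a new bytes object.)
    return mapping[offset:offset + length]


def demo():
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 64)  # 16 KiB of repeating 0..255
        f.flush()
        # Length 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            via_syscall = read_with_syscall(f.fileno(), 5000, 4)
            via_mmap = read_with_mmap(m, 5000, 4)
    return via_syscall, via_mmap
```

Both helpers return the same bytes; what differs is how often the program crosses into the kernel, which matters most when there are many small random reads.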
Memory-mapped files cannot be larger than 2 GB on 32-bit systems, because the whole mapping has to fit in the process's address space. When a memmap causes a file to be created or extended beyond its current size in the filesystem, the contents of the new part are unspecified.
The principal benefits of memory-mapping are efficiency, faster file access, the ability to share memory between applications, and more efficient coding.
Using wide vector instructions for data copying makes effective use of the memory bandwidth, and combined with CPU prefetching this makes mmap very fast.
On 64-bit, go ahead and map the file.
One thing to consider, based on Linux experience: if the access is truly random and the file is much bigger than you can expect to cache in RAM (so the chances of hitting a page again are slim), then it can be worth passing MADV_RANDOM to madvise. That tells the kernel not to expect sequential access, so read-ahead is less aggressive and the file pages you touch do not steadily and pointlessly push other, actually useful data out of memory. No idea what the Windows equivalent API is, though.
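For what it's worth, the same idea can be sketched in Python 3.8+, which exposes `madvise` on mmap objects on platforms that support it. This is only an illustration under those assumptions: `map_for_random_access` and `demo` are hypothetical helper names, and the `hasattr` guard simply skips the hint where `MADV_RANDOM` is not available.

```python
import mmap
import os
import random
import tempfile


def map_for_random_access(path):
    """Map the whole file read-only and hint that access will be random."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. The mapping stays valid after
        # the file object is closed.
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Only issue the hint where this platform provides it.
    if hasattr(m, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        m.madvise(mmap.MADV_RANDOM)
    return m


def demo():
    # Small stand-in file; a real use case would be a multi-GB file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(1 << 20))
        path = tmp.name
    try:
        with map_for_random_access(path) as m:
            rng = random.Random(0)
            # Touch 1000 random offsets; each access may fault in one page.
            for _ in range(1000):
                _ = m[rng.randrange(len(m))]
            return len(m)
    finally:
        os.remove(path)
```

In C the equivalent would be a plain `mmap` call followed by `madvise(addr, length, MADV_RANDOM)` on the returned address.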