mmap versus memory allocated with new

Question

I have a BitVector class that can either allocate memory dynamically using new or it can mmap a file. There isn't a noticeable difference in performance when using it with small files, but when using a 16GB file I have found that the mmap file is far slower than the memory allocated with new. (Something like 10x slower or more.) Note that my machine has 64GB of RAM.

The code in question is loading values from a large disk file and placing them into a Bloom filter which uses my BitVector class for storage.

At first I thought this might be because the backing for the mmap file was on the same disk as the file I was loading from, but this didn't seem to be the issue. I put the two files on two physically different disks, and there was no change in performance. (Although I believe they are on the same controller.)

Then, I used mlock to try to force everything into RAM, but the mmap implementation was still really slow.

So, for the time being I'm just allocating the memory directly. The only thing I'm changing in the code for this comparison is a flag passed to the BitVector constructor.
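For readers who want to reproduce the comparison, here is a minimal sketch of what such a class might look like. It is not the actual BitVector from the question: the constructor signature, the member names, and the choice of MAP_SHARED are all assumptions made for illustration, and it assumes a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for the BitVector in the question: the same bit
// operations, with the storage backend chosen by a constructor argument.
class BitVector {
public:
    // path == nullptr: heap storage via new[]. Otherwise the file at `path`
    // is created if needed, resized to fit, and mapped with mmap.
    explicit BitVector(std::size_t bits, const char* path = nullptr)
        : bits_(bits), bytes_((bits + 7) / 8), mapped_(path != nullptr) {
        if (!mapped_) {
            data_ = new std::uint8_t[bytes_]();  // value-initialised to zero
            return;
        }
        int fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0) throw std::runtime_error("open failed");
        if (ftruncate(fd, static_cast<off_t>(bytes_)) != 0) {
            close(fd);
            throw std::runtime_error("ftruncate failed");
        }
        void* p = mmap(nullptr, bytes_, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
        data_ = static_cast<std::uint8_t*>(p);
    }

    ~BitVector() {
        if (mapped_) munmap(data_, bytes_);
        else delete[] data_;
    }

    BitVector(const BitVector&) = delete;
    BitVector& operator=(const BitVector&) = delete;

    void set(std::size_t i) {
        assert(i < bits_);
        data_[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    }

    bool get(std::size_t i) const {
        assert(i < bits_);
        return ((data_[i / 8] >> (i % 8)) & 1u) != 0;
    }

private:
    std::size_t bits_;
    std::size_t bytes_;
    bool mapped_;
    std::uint8_t* data_ = nullptr;
};
```

With this sketch, BitVector v(n) uses the heap and BitVector v(n, "/some/file") uses a file-backed mapping, so the two backends can be timed against the same workload.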

Note that to measure performance I'm both looking at top and watching how many states I can add into the Bloom filter per second. The CPU usage doesn't even register on top when using mmap - although jbd2/sda1-8 starts to move up (I'm running on an Ubuntu server), which looks to be a process that is dealing with journaling for the drive. The input and output files are stored on two HDDs.

Can anyone explain this huge difference in performance?

Thanks!

Arunmu · Accepted Answer

To start with, mmap is a system call that exposes the system's virtual memory: it maps a file into the address space of your process.
On Linux (I hope you are working on *nix), a lot of performance is gained through lazy loading, more commonly known as demand paging. (Copy-on-write is a closely related lazy technique.)

mmap uses this kind of lazy loading as well.

When you call mmap on a file, the kernel does not immediately allocate main-memory pages for the file being mapped.
Instead, it waits until the program reads from or writes to a page of the mapping that is not yet backed by memory. At that point a page fault occurs, and the fault handler loads the part of the file that belongs in that page frame. The page table is also updated, so the next read or write to the same page finds a valid frame.
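This lazy behavior can be observed from user space. The sketch below is only an illustration, assuming Linux or another POSIX system; the function names and the scratch file under /tmp are made up for the example. It maps a temporary file and counts the page faults the process takes while touching each page for the first time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

// Total page faults (minor + major) taken by this process so far.
static long page_faults() {
    struct rusage ru{};
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt + ru.ru_majflt;
}

// Maps a scratch file of `pages` pages, writes one byte into each page, and
// returns how many page faults those writes caused. Returns -1 on error.
long faults_from_touching(std::size_t pages) {
    assert(pages > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t len = pages * page;

    char path[] = "/tmp/mmap_demo_XXXXXX";  // hypothetical scratch file
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);  // the file disappears once the mapping is gone
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
        close(fd);
        return -1;
    }

    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return -1;

    volatile char* bytes = static_cast<volatile char*>(p);
    const long before = page_faults();  // taken after mmap has returned
    for (std::size_t i = 0; i < pages; ++i)
        bytes[i * page] = 1;  // first access to a page goes through the fault handler
    const long after = page_faults();

    munmap(p, len);
    return after - before;
}
```

Calling faults_from_touching(64) should return a positive number. The exact count depends on the kernel and the filesystem, so treat it as a rough indicator rather than a precise measurement.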

You can control this behavior with mlock, madvise, the MAP_POPULATE flag to mmap, and so on.
The MAP_POPULATE flag tells the kernel to map the file into memory pages before the call returns, rather than page faulting each time you access a new page. The call therefore blocks until the file has been loaded.

From the Man Page:

MAP_POPULATE (since Linux 2.5.46)
              Populate (prefault) page tables for a mapping.  For a file
              mapping, this causes read-ahead on the file.  Later accesses
              to the mapping will not be blocked by page faults.
              MAP_POPULATE is supported for private mappings only since
              Linux 2.6.23.
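As a rough sketch of how the flag is passed, the helper below maps a whole file with MAP_SHARED | MAP_POPULATE. The Mapping struct and the map_populated name are invented for this example, and MAP_POPULATE is Linux-specific, so this assumes a Linux build.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical result type: base address and length of the mapping.
struct Mapping {
    void* addr;
    std::size_t len;
};

// Maps the whole file at `path` read/write and asks the kernel to prefault
// the mapping before mmap returns. Returns {nullptr, 0} on failure.
Mapping map_populated(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR);
    if (fd < 0) return {nullptr, 0};

    struct stat st{};
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return {nullptr, 0};
    }
    const std::size_t len = static_cast<std::size_t>(st.st_size);

    // MAP_POPULATE is the only difference from an ordinary shared mapping.
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_POPULATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return {nullptr, 0};
    return {p, len};
}
```

The caller releases the mapping with munmap(m.addr, m.len). For a very large file, expect the mmap call itself to take noticeably longer, since the read-ahead happens up front instead of being spread across later accesses.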

Tags: c++, performance, memory-management

Nathan S.
