I was wondering whether you could do multithreaded writes to a single file by using memory-mapped files, and making sure that two threads don't write to the same area (e.g. by interleaving fixed-size records), thus alleviating the need for synchronization at the application level, i.e. without using critical sections or mutexes in my code.
However, after googling for a bit, I'm still not sure. This link from Microsoft says:
First, there is an obvious savings of resources because both processes share both the physical page of memory and the page of hard disk storage used to back the memory-mapped file. Second, there is only one set of data, so all views are always coherent with one another. This means that changes made to a page in the memory-mapped file via one process's view are automatically reflected in a common view of the memory-mapped file in another process. Essentially, Windows NT is not required to do any special bookkeeping to ensure the integrity of data to both applications.
But does it apply to threads belonging to the same process? It would be plausible (since my writes are disjoint), but I don't know enough about the underlying implementation of memory mapping (e.g. what book-keeping the OS does) to be sure.
Example use case, where myFunction is executed by each thread:
// crt - index of current thread, in 0..n-1
// n - thread count
// memArea - pointer to memory location obtained from mapping a file
void myFunction(int crt, int n, int* memArea)
{
    // Each thread writes 512 interleaved records. memArea is an int*, so
    // indexing already advances in units of sizeof(int).
    for (int i = 0; i < 512; i++)
        memArea[n * i + crt] = n * i + crt;
}
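For concreteness, here is a minimal sketch (untested) of how such a memArea could be obtained and the threads launched on Windows; the file name "data.bin", the thread count n = 4, and the minimal error handling are illustrative assumptions, not part of the original question:

// Sketch only: map a file read/write and run myFunction on n threads.
// CreateFileW / CreateFileMappingW / MapViewOfFile are the standard Win32 calls.
#include <windows.h>
#include <thread>
#include <vector>

int main()
{
    const int n = 4;                                  // thread count (assumption)
    const DWORD fileSize = n * 512 * sizeof(int);     // one int slot per (i, crt) pair

    HANDLE file = CreateFileW(L"data.bin", GENERIC_READ | GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    // The mapping object grows the file to fileSize bytes.
    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READWRITE, 0, fileSize, nullptr);
    if (!mapping) return 1;

    // One read/write view, shared by every thread in this process.
    int* memArea = static_cast<int*>(MapViewOfFile(mapping, FILE_MAP_WRITE, 0, 0, fileSize));
    if (!memArea) return 1;

    std::vector<std::thread> threads;
    for (int crt = 0; crt < n; ++crt)
        threads.emplace_back(myFunction, crt, n, memArea);
    for (auto& t : threads)
        t.join();

    UnmapViewOfFile(memArea);   // dirty pages are written back to the file by the OS
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}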
If I were to run this, wait for the threads to finish, unmap the file and exit, would I end up with a file containing consecutive integers?
I would be grateful for an informed answer.
Benefits. The benefit of memory mapping a file is increased I/O performance, especially with large files. For small files, memory-mapped files can waste slack space, since a mapping is always aligned to the page size, which is typically 4 KiB.
In my performance tests, I'm seeing that reading from memory-mapped files is 30X faster than reading through regular C++ stdio.
Benefits of memory-mapped files: accessing RAM is faster than disk I/O, so a performance boost is achieved when dealing with extremely large files. Memory-mapped files also offer lazy loading, which equates to using a small amount of RAM even for a large file.
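As a rough illustration of that lazy-loading point, a read-only mapping reads nothing from disk up front; only the 4 KiB pages that are actually touched get faulted in. The sketch below is Windows-specific, and the file name "big.bin" is an assumption:

// Sketch only: map an existing file read-only and sample one byte per MiB.
// Only the sampled pages are paged in; the rest of the file never leaves disk.
#include <windows.h>
#include <cstdio>

int main()
{
    HANDLE file = CreateFileW(L"big.bin", GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    HANDLE mapping = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) return 1;

    // Map the whole file; this by itself reads no data.
    const unsigned char* data =
        static_cast<const unsigned char*>(MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));
    if (!data) return 1;

    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);

    unsigned sum = 0;
    for (LONGLONG off = 0; off < size.QuadPart; off += 1 << 20)
        sum += data[off];                  // each access faults in at most one page
    std::printf("sum of sampled bytes: %u\n", sum);

    UnmapViewOfFile(data);
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}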
You'll need to add the synchronization regardless of whether the MMF view is accessed from multiple processes or from multiple threads inside one process. FWIW, it doesn't make any sense to use an MMF for memory sharing inside one process; threads already share the address space.
But does it apply to threads belonging to the same process?
Yes. If one thread changes part of the data in the mapping, then all other threads immediately see that change.
You need to ensure the threads coordinate their changes so that no thread is accessing an inconsistent view (e.g. all access goes through a critical section).
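For illustration, one way to do that coordination on Windows is to funnel all access to the view through a CRITICAL_SECTION; the writeRecord helper and the single global lock below are illustrative, not part of the answer:

// Sketch only: serialize all access to the shared view, so no thread can
// observe a half-updated record. g_cs must be set up once with
// InitializeCriticalSection(&g_cs) before the threads start.
#include <windows.h>

CRITICAL_SECTION g_cs;

void writeRecord(int* memArea, int slot, int value)
{
    EnterCriticalSection(&g_cs);
    memArea[slot] = value;      // every read/write of the view goes through here
    LeaveCriticalSection(&g_cs);
}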