I am trying to understand how mmap works while looking at man mmap.
As I understand it, it adds a mapping to the page table that maps between the file and the virtual address (which is the address given as void *addr).
So, what happens when 2 programs map the same file? Are there 2 entries in the page table, one for each program?
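In modern operating systems, each process has its own page table for its memory, whose entries may point to pages of physical memory shared with other processes and with the kernel. So when two programs map the same file, each one gets its own entries in its own page table, and those entries may refer to the same physical pages.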
With MAP_SHARED, this mapping is shared: updates to the mapping are visible to other processes that map this file, and are carried through to the underlying file. The file may not actually be updated until msync(2) or munmap() is called.
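To make the MAP_SHARED description concrete, here is a minimal sketch that writes into a file through a shared mapping and reads the result back with plain read(2). The helper names write_via_mapping and read_via_fd and the path passed to them are made up for illustration, and the explicit msync() call is there so the example does not rely on when the system decides to write the pages back.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with `len` bytes, then store `msg`
 * at the start of the file through a MAP_SHARED mapping.
 * Returns 0 on success, -1 on failure. */
int write_via_mapping(const char *path, const char *msg, size_t len)
{
    assert(path != NULL && msg != NULL);

    size_t n = strlen(msg) + 1;            /* include the terminating NUL */
    if (n > len)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {   /* file must cover the mapping */
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                             /* the mapping outlives the fd */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, msg, n);                     /* an ordinary memory write */
    int rc = msync(p, len, MS_SYNC);       /* push dirty pages to the file */
    if (munmap(p, len) < 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: read the start of the file with read(2).
 * Returns the number of bytes read, or -1 on failure. */
ssize_t read_via_fd(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Calling write_via_mapping("/tmp/demo.bin", "hello", 4096) and then read_via_fd() on the same path should return the bytes that were stored through the mapping.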
This seems very interesting, but there are numerous caveats:
The pages mapped by the two processes for the same file may reside at the same virtual address or at a different address in each process. Pointers stored in this shared memory may therefore be unusable by the other process, because the address they hold may not refer to the same data there.
The implementation may or may not use the same physical memory pages for both mappings. For subtle reasons (cache strategies, out-of-sync reads...), even if it is the same physical memory, modifications made by one process to its memory may not be immediately reflected in the memory of the other process.
So the modification may or may not be visible to other processes that mmap the file or read it via read or the FILE* stream API.
If one of the processes calls msync(), the modifications should be visible in all mappings and for all yet-unread portions of the file, bearing in mind that the FILE* stream APIs may have buffered some data in internal unshared buffers: modifications in this area will not be reflected.
Conclusion: it is risky and unreliable to use these mechanisms to implement interprocess communication. The behavior may depend on system-specific characteristics such as the OS strategies, the CPU and cache architectures, the type of RAM in use, the clock speed, and who knows what else. It is safer to rely on proven APIs, which may indeed be implemented using mmapped memory, but only if they are known to provide the correct semantics.
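As one possible example of such an API, here is a minimal sketch that passes a message between two processes through a pipe, where the kernel takes care of ordering and visibility. The pipe is only an illustrative choice, and the helper name send_via_pipe is made up.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sends `msg` (with its NUL)
 * through a pipe, and collect it in the parent.
 * Returns the number of bytes received, or -1 on failure. */
ssize_t send_via_pipe(const char *msg, char *out, size_t outlen)
{
    assert(msg != NULL && out != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                        /* child: writer */
        close(fds[0]);
        size_t n = strlen(msg) + 1;
        ssize_t w = write(fds[1], msg, n);
        _exit(w == (ssize_t)n ? 0 : 1);
    }

    close(fds[1]);                         /* parent: reader */
    size_t total = 0;
    ssize_t r = 0;
    while (total < outlen &&
           (r = read(fds[0], out + total, outlen - total)) > 0)
        total += (size_t)r;                /* read until EOF or buffer full */
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (r < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)total;
}
```

With a large enough buffer, send_via_pipe("ping", out, sizeof out) leaves the string in out once the child has exited.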