If I have a pointer and care about memory-access performance, I may want to check whether the next operation on it will trigger a page fault. If it will, an algorithm could reorder its loop operations to minimize page faults.
Is there any portable (or Linux/Windows non-portable) way to check whether accessing a particular memory address will trigger a page fault?
Most of these abstractions intentionally obscure something central to storage: the address in memory where something is stored. Pointers are a way to get closer to memory and to manipulate the contents of memory directly. In this chapter, we will discuss pointers and how pointers are used to work with memory.
If we send a pointer to memory to a function, any changes to the pointer itself will be ignored, but the function can dereference the pointer and make changes to memory that the pointer references. That memory is not part of the parameter list of the function and those changes will be reflected back to the caller.
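A minimal sketch of that behaviour in C. The name `set_to_zero` is hypothetical and used only for illustration: the write through the pointer is visible to the caller, while reassigning the pointer parameter is not.

```c
#include <stddef.h>

/* Hypothetical helper: modifies the memory that p references,
   then reassigns p itself. Only the first change reaches the caller. */
void set_to_zero(int *p)
{
    *p = 0;     /* changes the caller's memory */
    p = NULL;   /* changes only this function's local copy of the pointer */
}
```

After calling `set_to_zero(&value)`, `value` is 0, and any pointer the caller passed in still points where it did before the call.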
This is especially useful when a pointer points to the beginning of an allocated area in memory. Let's say that we have code that just allocated space in memory for 20 integers. By dereferencing the pointer, we gain access to the first integer in the space.
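As a sketch of what such an allocation might look like in C (the function name `make_ints` and the stored values are assumptions for illustration only):

```c
#include <stdlib.h>

/* Hypothetical helper: allocates space for 20 integers and writes to the
   first two of them. Returns NULL if the allocation fails.
   The caller is responsible for calling free() on the result. */
int *make_ints(void)
{
    int *block = malloc(20 * sizeof *block);
    if (block == NULL)
        return NULL;

    *block = 42;        /* dereference: the first integer in the space */
    *(block + 1) = 7;   /* pointer arithmetic: the second integer */
    return block;
}
```

Here `*block` and `block[0]` name the same integer, the one at the start of the allocated area.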
These pointers do not point to anything and trying to free them will cause a program error. Sometimes, memory is allocated by function calls within functions. That kind of allocation is usually freed up by a companion function to the function that allocated the space. (See below in "Pointers and Pebble Programming" for examples.)
About ten years ago, Emery Berger proposed a VM-aware garbage collection strategy which required the application to know which pages were present in memory. For testing purposes, he and his students produced a kernel patch which notified the application of paging events using real-time signals, allowing the garbage collector to maintain its own database of resident pages. (Although that seems like duplication of effort, it is a lot more efficient than multiple system calls in order to obtain information every time it is needed.)
You can find information about this interesting research on his research page.
As far as I know, there is no implementation of this patch for a recent Linux kernel, but it would always be possible to resurrect it.
On Linux there is a mechanism; see man proc:
/proc/[pid]/pagemap
This file shows the mapping of each of the process's virtual pages into physical page frames or swap area. It contains one 64-bit value for each virtual page, with the bits set as follows:
- 63 If set, the page is present in RAM.
- 62 If set, the page is in swap space
- ...
For example,
$ sudo hexdump -e '/0 "%08_ax "' -e '/8 "%016X" "\n"' /proc/self/pagemap
00000000 0600000000000000
*
00002000 A6000000000032FE
00002008 A60000000007F3A6
00002010 A600000000094560
00002018 A60000000008D0C0
00002020 A60000000009EBE6
00002028 A6000000000C8E87
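To make the bit layout concrete, here is a small sketch that decodes a single entry. The function names are hypothetical; only bits 63 and 62, as described above, are used.

```c
#include <stdint.h>

/* Hypothetical helpers for one 64-bit pagemap entry. */

/* Bit 63: page is present in RAM. */
int pagemap_present(uint64_t entry)
{
    return (int)((entry >> 63) & 1);
}

/* Bit 62: page is in swap space. */
int pagemap_swapped(uint64_t entry)
{
    return (int)((entry >> 62) & 1);
}
```

Applied to the dump above, the entry `A6000000000032FE` has bit 63 set and bit 62 clear, so that page is present in RAM, while `0600000000000000` has neither bit set. Each entry is 8 bytes, so the entry at file offset 0x2000 is page index 0x400, which corresponds to virtual address 0x400000 if pages are 4 KiB.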
I wrote the page-info library to do this on Linux. It uses the pagemap file under the covers, so it won't be portable to other OSes.
Some information is restricted to root users, but you should be able to get the information about page presence (whether it is in RAM or not) without being root. Quoting from the README:
So [as a non-root user] you can determine if a page is present, swapped out, its soft-dirty status, whether it is exclusive and whether it is a file mapping, but not much more. On older kernels, you can also get the physical frame number (the pfn) field, which is essentially the physical address of the page (shifted right by 12).
The performance isn't exactly optimized for querying large ranges, as it does a separate read for each page, but a PR to improve this would be gratefully accepted.
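For illustration, here is a rough sketch of how a whole range could be queried with a single read of `/proc/self/pagemap`. This is not the page-info API: `count_present_pages` is a hypothetical name, and the code assumes that a short read should simply be treated as an error.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not part of page-info.
   Counts the pages covering [addr, addr + len) that are present in RAM,
   fetching all pagemap entries with one pread().
   Returns -1 if len is 0 or pagemap could not be read in full. */
long count_present_pages(const void *addr, size_t len)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || len == 0)
        return -1;

    uintptr_t first = (uintptr_t)addr / (uintptr_t)page_size;
    uintptr_t last = ((uintptr_t)addr + len - 1) / (uintptr_t)page_size;
    size_t npages = (size_t)(last - first + 1);

    uint64_t *entries = malloc(npages * sizeof *entries);
    if (entries == NULL)
        return -1;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) {
        free(entries);
        return -1;
    }

    /* One 64-bit entry per virtual page, indexed by page number. */
    ssize_t want = (ssize_t)(npages * sizeof *entries);
    ssize_t got = pread(fd, entries, (size_t)want,
                        (off_t)(first * sizeof *entries));
    close(fd);

    long present = -1;
    if (got == want) {
        present = 0;
        for (size_t i = 0; i < npages; i++)
            present += (long)((entries[i] >> 63) & 1);  /* bit 63: in RAM */
    }
    free(entries);
    return present;
}
```

Keep in mind that the answer is only a snapshot: a page reported as present can be evicted before you actually touch it, so this is a hint for ordering work rather than a guarantee that no fault will occur.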