When working on a NUMA system, memory can be local or remote relative to the current NUMA node. To keep memory local there is a "first-touch" policy (the default memory-to-node binding strategy): http://lse.sourceforge.net/numa/status/description.html
Default Memory Binding: It is important that user programs' memory is allocated on a node close to the one containing the CPU on which they are running. Therefore, by default, page faults are satisfied by memory from the node containing the page-faulting CPU. Because the first CPU to touch the page will be the CPU that faults the page in, this default policy is called "first touch".
http://techpubs.sgi.com/library/dynaweb_docs/0640/SGI_Developer/books/OrOn2_PfTune/sgi_html/ch08.html
The default policy is called first-touch. Under this policy, the process that first touches (that is, writes to, or reads from) a page of memory causes that page to be allocated in the node on which the process is running. This policy works well for sequential programs and for many parallel programs as well.
There are also other, non-local policies, as well as a function to explicitly move a memory segment to a given NUMA node.
But sometimes (in the context of many threads of a single application) it would be useful to have a "next touch" policy: call some function to "unbind" a memory region (up to hundreds of MB) that already holds data, and re-arm a "first touch"-like handler on that region, so that each page migrates on its next touch (read or write) to the NUMA node of the accessing thread.
This policy is useful when a large data set is processed by many threads and the access pattern changes between phases (e.g. in the first phase the 2D array is split between threads by columns; in the second phase the same data is split by rows).
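To make the two-phase example concrete, here is a minimal sketch in C of how element ownership shifts between a column split and a row split. All function names are hypothetical and introduced only for illustration, and a simple contiguous block partition is assumed; real codes may partition differently.

```c
#include <stddef.h>

/* Contiguous block partition: which of nthreads owns index i out of n.
 * Assumes n > 0 and nthreads > 0. */
size_t block_owner(size_t i, size_t n, size_t nthreads)
{
    size_t chunk = (n + nthreads - 1) / nthreads; /* ceiling division */
    return i / chunk;
}

/* Phase 1: the array is split between threads by columns. */
size_t owner_phase_columns(size_t row, size_t col,
                           size_t rows, size_t cols, size_t nthreads)
{
    (void)row;
    (void)rows;
    return block_owner(col, cols, nthreads);
}

/* Phase 2: the same array is split between threads by rows. */
size_t owner_phase_rows(size_t row, size_t col,
                        size_t rows, size_t cols, size_t nthreads)
{
    (void)col;
    (void)cols;
    return block_owner(row, rows, nthreads);
}

/* Count the elements whose owning thread differs between the phases.
 * With first-touch placement done in phase 1, these are the elements
 * that may end up remote in phase 2 if their threads run on other nodes. */
size_t count_owner_changes(size_t rows, size_t cols, size_t nthreads)
{
    size_t changed = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            if (owner_phase_columns(r, c, rows, cols, nthreads) !=
                owner_phase_rows(r, c, rows, cols, nthreads))
                changed++;
    return changed;
}
```

For a 4x4 array and 2 threads, half of the elements (8 of 16) change owner between the phases, which is the situation where migrating pages on the next touch would pay off.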
Such a policy has been supported in Solaris since Solaris 9 via madvise with the MADV_ACCESS_LWP flag:
https://cims.nyu.edu/cgi-systems/man.cgi?section=3C&topic=madvise
MADV_ACCESS_LWP Tell the kernel that the next LWP to touch the specified address range will access it most heavily, so the kernel should try to allocate the memory and other resources for this range and the LWP accordingly.
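As a rough illustration of the Solaris call quoted above, the following sketch wraps it in a helper. The name `request_next_touch` is hypothetical, and the `#ifdef` guard is an assumption made so that the snippet also compiles on systems (such as Linux) where MADV_ACCESS_LWP is not defined; there it simply reports that the advice is unsupported.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the kernel to place [addr, addr + len)
 * near whichever LWP touches it next. Returns 0 on success and
 * -1 with errno set on failure. */
int request_next_touch(void *addr, size_t len)
{
#ifdef MADV_ACCESS_LWP
    /* Solaris: the next LWP to touch the range is expected
     * to access it most heavily. */
    return madvise((caddr_t)addr, len, MADV_ACCESS_LWP);
#else
    /* No equivalent advice value on this platform. */
    (void)addr;
    (void)len;
    errno = ENOTSUP;
    return -1;
#endif
}
```

A caller would invoke this on the shared array between the two processing phases, before the threads start accessing it with the new pattern.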
There was a patch to the Linux kernel (May 2009) named "affinity-on-next-touch", http://lwn.net/Articles/332754/ (thread), but as I understand it, it was not accepted into mainline. Is that right?
Also there were Lee Schermerhorn's "migrate_on_fault" patches http://free.linux.hp.com/~lts/Patches/PageMigration/.
So, the question: is there any next-touch support for NUMA in the current vanilla Linux kernel, or in a major fork such as the Red Hat or Oracle Linux kernel?
As far as I know, there isn't anything similar in the vanilla kernel. numactl has functions to migrate pages manually, but that's probably not helpful in your case. (The NUMA policy description is in Documentation/vm/numa_memory_policy if you want to check for yourself.)
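For reference, manual migration of a region can be sketched with the move_pages(2) system call. This is only a minimal sketch, assuming Linux: `pages_spanned`, `migrate_region_to_node` and `SKETCH_MPOL_MF_MOVE` are names introduced here for illustration, and the raw `syscall()` is used only to avoid assuming that the libnuma headers are installed. Note that this is explicit migration to a node the caller picks, not next-touch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same value as MPOL_MF_MOVE in <linux/mempolicy.h>; defined locally
 * (an assumption of this sketch) to avoid depending on that header. */
#define SKETCH_MPOL_MF_MOVE (1 << 1)

/* Number of pages overlapped by the byte range [addr, addr + len). */
size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    if (len == 0)
        return 0;
    uintptr_t first = addr / page_size;
    uintptr_t last = (addr + len - 1) / page_size;
    return (size_t)(last - first + 1);
}

/* Ask the kernel to move every page of the region to target_node.
 * Returns the move_pages result (0 if all pages were handled, a
 * positive count of non-migrated pages on newer kernels) or -errno.
 * Per-page status codes are discarded for brevity. */
long migrate_region_to_node(void *addr, size_t len, int target_node)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -EINVAL;
    size_t page_size = (size_t)ps;
    size_t n = pages_spanned((uintptr_t)addr, len, page_size);
    if (n == 0)
        return 0;

    void **pages = malloc(n * sizeof *pages);
    int *nodes = malloc(n * sizeof *nodes);
    int *status = malloc(n * sizeof *status);
    long rc = -ENOMEM;

    if (pages && nodes && status) {
        /* Round the start address down to a page boundary. */
        uintptr_t base = (uintptr_t)addr & ~(uintptr_t)(page_size - 1);
        for (size_t i = 0; i < n; i++) {
            pages[i] = (void *)(base + i * page_size);
            nodes[i] = target_node;
            status[i] = 0;
        }
        /* pid 0 means the calling process. */
        rc = syscall(SYS_move_pages, (long)0, (unsigned long)n,
                     pages, nodes, status, (long)SKETCH_MPOL_MF_MOVE);
        if (rc < 0)
            rc = -errno;
    }
    free(pages);
    free(nodes);
    free(status);
    return rc;
}
```

Each thread would have to call something like this itself, with its own node number, before starting the second phase, which is exactly the bookkeeping that a next-touch policy would make unnecessary.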
I think those patches were not merged, as I don't see any of the relevant code in the current kernel.