Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there NUMA next-touch policy in modern Linux

When we working on NUMA system, memory can be local or remote relative to current NUMA node. To make memory more local there is a "first-touch" policy (the default memory to node binding strategy): http://lse.sourceforge.net/numa/status/description.html

Default Memory Binding It is important that user programs' memory is allocated on a node close to the one containing the CPU on which they are running. Therefore, by default, page faults are satisfied by memory from the node containing the page-faulting CPU. Because the first CPU to touch the page will be the CPU that faults the page in, this default policy is called "first touch".

http://techpubs.sgi.com/library/dynaweb_docs/0640/SGI_Developer/books/OrOn2_PfTune/sgi_html/ch08.html

The default policy is called first-touch. Under this policy, the process that first touches (that is, writes to, or reads from) a page of memory causes that page to be allocated in the node on which the process is running. This policy works well for sequential programs and for many parallel programs as well.

There are also some other non-local policies. Also there is a function to require explicit move of memory segment to some NUMA node.

But sometimes (in context of many threads of single applications) it can be useful to have "next touch" policy: call some function to "unbind" some memory region (up to 100s MB) with some data and reapply the "first touch"-like handler on this region which will migrate the page on next touch (read or write) to the numa node of accessing thread.

This policy is useful in case when there are huge data to process by many threads and there are different patterns of access to this data (e.g. first phase - split the 2D array by columns via threads; second - split the same data by rows).

Such policy was supported in Solaris since 9 via madvice with MADV_ACCESS_LWP flag

https://cims.nyu.edu/cgi-systems/man.cgi?section=3C&topic=madvise

MADV_ACCESS_LWP Tell the kernel that the next LWP to touch the specified address range will access it most heavily, so the kernel should try to allocate the memory and other resources for this range and the LWP accordingly.

There was (may 2009) the patch to linux kernel named "affinity-on-next-touch", http://lwn.net/Articles/332754/ (thread) but as I understand it was unaccepted into mainline, isn't it?

Also there were Lee Schermerhorn's "migrate_on_fault" patches http://free.linux.hp.com/~lts/Patches/PageMigration/.

So, the question: Is there some next-touch for NUMA in current vanilla Linux kernel or in some major fork, like RedHat linux kernel or Oracle linux kernel?

like image 512
osgx Avatar asked Aug 30 '12 12:08

osgx


1 Answers

Given my understanding, there aren't anything similar in the vanilla kernel. numactl has functions to migrate pages manually, but it's probably not helpful in your case. (NUMA policy description is in Documentation/vm/numa_memory_policy if you want to check yourself)

I think those patches are not merged as I don't see any of the relevant code snippets showing up in current kernel.

like image 77
wujj123456 Avatar answered Jan 05 '23 02:01

wujj123456