If I have a multi-processor board with cache-coherent non-uniform memory access (NUMA), i.e. separate "northbridges" with separate RAM for each processor, does any compiler know how to automatically spread data across the different memory systems so that threads mostly retrieve their data from the RAM attached to the processor they are running on?
I have a setup where 1 GB is attached to processor 0, 1 GB to processor 1, etc., up to 4 processors. In the coherent memory space, the RAM on the first processor occupies physical addresses 0 to 1GB-1, the second processor's RAM occupies 1GB to 2GB-1, and so on.
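To make the layout concrete, here is a minimal sketch of the address-to-processor mapping described above. It assumes exactly the layout in the question (4 processors, 1 GB each, contiguous and starting at physical address 0); the function name node_of_phys_addr is hypothetical, and real hardware may interleave or map memory differently.

```c
#include <stdint.h>

#define GIB        (UINT64_C(1) << 30)  /* 1 GB as used in the question */
#define NODE_COUNT 4                    /* four processors, one RAM bank each */

/* Return the processor (0..3) whose local RAM holds phys_addr under the
 * assumed layout, or -1 if the address lies beyond the 4 GB that exist. */
int node_of_phys_addr(uint64_t phys_addr)
{
    uint64_t node = phys_addr / GIB;
    return node < NODE_COUNT ? (int)node : -1;
}
```

Note that a user-space program only sees virtual addresses, so this mapping is something the kernel reasons about, not something malloc can act on directly.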
Will any compilers, or perhaps malloc specifically, associate new memory allocated by a process on a specific core with the physical RAM attached to that core?
gcc is used to compile C programs. g++ can compile any .c or .cpp file, but it treats them all as C++ files.
GCC has experimental support for the latest revision of the C++ standard, which was published in 2020. C++20 features are available since GCC 8.
The Linux kernel is NUMA-aware and will try to give your process pages from memory local to the current CPU (source: U. Drepper, "What Every Programmer Should Know About Memory").
NUMA-aware memory allocation is not done at compile time. Making assumptions like this would be bad for portability.
On Linux, this is a kernel function, though you can control it at runtime with numactl, set_mempolicy, or libnuma.
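Since placement happens at run time, the usual way to benefit from the kernel's default local-allocation behaviour is to have each thread allocate and first write ("first touch") its own data. The following is a rough sketch of that pattern using only pthreads and the C library; the names worker_arg, worker, and run_workers are made up for illustration. It does not pin threads to CPUs, so the scheduler may still migrate them, and malloc may hand back pages that were already touched earlier, so treat it as an illustration rather than a guarantee of local placement.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Per-thread input and result (hypothetical helper type). */
typedef struct {
    size_t bytes;            /* size of the thread's private buffer */
    unsigned long checksum;  /* sum of the buffer's bytes, filled in by the thread */
    int ok;                  /* 1 if the allocation succeeded */
} worker_arg;

static void *worker(void *p)
{
    worker_arg *arg = p;
    /* Allocate inside the thread that will use the data. */
    unsigned char *buf = malloc(arg->bytes);
    if (buf == NULL) {
        arg->ok = 0;
        return NULL;
    }
    /* First write: fresh pages are typically faulted in here, on the node
     * of the CPU this thread is running on (default Linux policy). */
    memset(buf, 1, arg->bytes);

    unsigned long sum = 0;
    for (size_t i = 0; i < arg->bytes; i++)
        sum += buf[i];
    arg->checksum = sum;
    arg->ok = 1;
    free(buf);
    return NULL;
}

/* Start nthreads workers, each with its own buffer; store the combined
 * checksum in *total. Returns 0 on success, -1 on any failure. */
int run_workers(int nthreads, size_t bytes_per_thread, unsigned long *total)
{
    if (nthreads <= 0 || total == NULL)
        return -1;
    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    worker_arg *args = calloc((size_t)nthreads, sizeof *args);
    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1;
    }

    int started = 0, rc = 0;
    for (; started < nthreads; started++) {
        args[started].bytes = bytes_per_thread;
        if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0) {
            rc = -1;
            break;
        }
    }

    *total = 0;
    for (int i = 0; i < started; i++) {  /* join only threads that started */
        pthread_join(tids[i], NULL);
        if (args[i].ok)
            *total += args[i].checksum;
        else
            rc = -1;
    }
    free(tids);
    free(args);
    return rc;
}
```

Calling run_workers(4, 1 << 20, &total) from main starts four threads that each touch 1 MB of their own memory. For explicit control over which node backs an allocation, the runtime tools mentioned in the answer are the place to look.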