Is there a convenient and efficient way to use the C++ standard container API in a NUMA-aware fashion?
I would like to do an OpenMP-parallel sparse matrix-vector multiplication (spMV) in a C++ environment. To allocate and initialize the vector and matrix values with respect to the NUMA domains, the C code would look roughly like this:
size_t N = 1000000;
double* vecVal = malloc(N * sizeof(double));
#pragma omp parallel for
for (size_t i = 0; i < N; ++i)
{
    /* under a first-touch policy, the page lands near the touching thread */
    vecVal[i] = 0.;
}
/* do spMV */
free(vecVal);
In C++ I would like to use std::vector (std::array with a fixed size is also OK). Does std::vector::reserve() do the trick? Is it legal to do something like this:
std::vector<double> vec;
vec.reserve(N);
double* vecVal = vec.data();
#pragma omp parallel for
for (size_t i = 0; i < N; ++i)
{
    vecVal[i] = 0.;
}
/* do spMV */
How can I set the correct size of the std::vector afterwards?
Does anyone know a more elegant way?
You have to use a special NUMA-aware allocator here. We implemented something like this for HPX here: https://github.com/STEllAR-GROUP/hpx/blob/master/hpx/parallel/util/numa_allocator.hpp
The basic idea is to do the first touch inside the allocator's allocate function. Replace the HPX executor stuff with your #pragma omp parallel for schedule(static) loop and you should be fine.
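To make the idea concrete, here is a minimal sketch of such an allocator using OpenMP. It is not the HPX implementation: the names first_touch_allocator and make_first_touch_vector are made up for illustration. It assumes the operating system uses a first-touch page placement policy, that threads are pinned to cores, and that T needs no more than the default operator new alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>
#include <new>
#include <vector>

// Hypothetical allocator: touches the fresh memory inside allocate() from a
// statically scheduled OpenMP loop, so pages are first touched by the thread
// that will later work on the same index range.
template <typename T>
struct first_touch_allocator {
    using value_type = T;

    first_touch_allocator() = default;
    template <typename U>
    first_touch_allocator(const first_touch_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T))
            throw std::bad_alloc();
        void* raw = ::operator new(n * sizeof(T));
        char* bytes = static_cast<char*>(raw);
        const std::ptrdiff_t count = static_cast<std::ptrdiff_t>(n);
        // First touch: zero the bytes of element i on the thread that owns i.
        // Without OpenMP enabled the pragma is ignored and the loop runs serially.
        #pragma omp parallel for schedule(static)
        for (std::ptrdiff_t i = 0; i < count; ++i)
            std::memset(bytes + i * sizeof(T), 0, sizeof(T));
        return static_cast<T*>(raw);
    }

    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// The allocator is stateless, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const first_touch_allocator<T>&, const first_touch_allocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const first_touch_allocator<T>&, const first_touch_allocator<U>&) noexcept { return false; }

// Hypothetical helper: constructing the vector with its final size avoids the
// reserve()/data() trick, and size() is correct from the start.
inline std::vector<double, first_touch_allocator<double>>
make_first_touch_vector(std::size_t n) {
    std::vector<double, first_touch_allocator<double>> v(n);
    assert(v.size() == n);
    return v;
}
```

The vector constructor still value-initializes the elements serially after allocate() returns, but by then the pages have already been placed, so this should not change their location. The placement only pays off if the later spMV loop uses the same schedule(static) partitioning over the same index range.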