Does OpenMP support an atomic minimum for C++11? If OpenMP has no portable method: Is there some way of doing it using a x86 or amd64 feature?
In the OpenMP specifications I found nothing for C++ but the Fortran version seems to support it. See 2.8.5 of the v3.1 for the details. For C++ it states
binop is one of +, *, -, /, &, ^, |, <<, or >>.
but for Fortran it states
intrinsic_procedure_name is one of MAX, MIN, IAND, IOR, or IEOR.
In case you are interested in more context: I am looking for a mutex free method of doing the following:
vector<omp_lock_t>lock;
vector<int>val;
#pragma omp parallel
{
// ...
int x = ...;
int y = ...;
if(y < val[x]){
omp_set_lock(&lock[x]);
if(y < val[x])
val[x] = y;
omp_unset_lock(&lock[x]);
}
}
I know that you can compute the minimum using a reduce algorithm. I know that there are circumstances where this largely outperforms any atomic minimum approach. However, I also know that this is not the case in my situation.
EDIT: One option that is slightly faster in my case is
int x = ...;
int y = ...;
while(y < val[x])
val[x] = y;
but that is no atomic operation.
All the newer GPUs have this feature and I am missing it on the CPU. (See atom_min for OpenCL.)
The OpenMP specification for C++ does not have support for atomic minimum. Neither does C++11.
I am assuming that in your algorithm, x
can compute to any valid index, regardless of thread.
I would suggest changing your algorithm, so that each thread uses its own val
array and then do a final reconciliation at the end, which can also be parallelized by index. This will avoid locks and atomics completely and give you the benefit of separating the data for each thread, i.e. no chance for false cache sharing. In other words, it should be faster.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With