The way I understand the explanations on the web, the OpenMP atomic directive in C++ applies to a specific memory location, designated by some variable (or a pointer to it?). So when I access this location on different lines of code within a parallelized for-loop, can I protect all of them, or will atomic only protect one line of code and ignore other lines that access the same memory location?
For example, consider the following piece of code:
const int N = 10000; // just some big number
float a[N] = {}; // a big array, zero-initialized
#pragma omp parallel for
for(int i = 1; i < N-1; i++) {
#pragma omp atomic
a[i-1] += 0.5f;
#pragma omp atomic
a[i] += 1.0f;
#pragma omp atomic
a[i+1] += 0.5f;
}
In every loop iteration, the same array is accessed at three points: at index i-1, i and i+1. In different threads, however, the i-1 line may evaluate to the same index as either the i or the i+1 line. For instance, when i==1 in thread 1 and i==3 in thread 2, the third access line (in thread 1) and the first access line (in thread 2) will access the same array element, a[2], possibly at the same time.
Will atomic protect these different lines if they happen to access the same memory location? Or does it only apply to one line and would the only solution be to incorporate all three accesses into one line (e.g. by putting i-1, i and i+1 in a second array and making a second for-loop that loops over them)?
From the OpenMP standard 3.1 (section 2.8.5):
The atomic construct ensures that a specific storage location is accessed atomically, rather than exposing it to the possibility of multiple, simultaneous reading and writing threads that may result in indeterminate values.
So, to give you a brief answer to:
Will atomic protect these different lines if they happen to access the same memory location?
Yes, it will.
But let me elaborate a bit more. According to the standard, the syntax of the construct is the following:
#pragma omp atomic new-line
expression-stmt
where expression-stmt has the form:
x binop= expr
x++
++x
x--
--x
Of course you have a few restrictions:
1. x must be an lvalue expression with scalar type
2. expr is an expression with scalar type, and it does not reference the object designated by x
3. binop is not an overloaded operator and is one of +, *, -, /, &, ^, |, <<, or >>
4. All atomic references to the storage location x throughout the program are required to have a compatible type
All these constraints are fulfilled by your snippet. In particular, point 4 is of no concern, as x is always a float in your code (in other words, a[i] is always a float). The usual example given to show a violation of point 4 is the use of a union, as in the link posted in other answers.
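As a rough illustration of the allowed statement forms, here is a minimal sketch. The function name count_and_sum and its variables are made up for this example and do not come from the question. It uses the x++ and x binop= expr forms on shared scalars; since both updates are exact integer operations, the result should be the same whether or not OpenMP is enabled.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper, for illustration only.
// Returns {number of iterations, sum of 0..n-1}.
std::pair<long, long> count_and_sum(int n) {
    assert(n >= 0);
    long count = 0;  // shared scalar lvalue, updated with the x++ form
    long sum = 0;    // shared scalar lvalue, updated with the x binop= expr form
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        #pragma omp atomic
        count++;
        #pragma omp atomic
        sum += i;    // expr (here i) does not reference sum
    }
    return {count, sum};
}
```

Each atomic directive covers only the single statement that follows it, which is why count and sum each get their own directive.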
Edit: My first try at this failed in Visual Studio 2012, although I could not think of any reason it should not work. I then changed the constants to float instead of double (0.5f instead of 0.5), and now it works. So, to answer your question: you can use atomic the way you did as long as you use the same data type (don't mix float and double). I learned this by reading http://msdn.microsoft.com/en-us/library/5fhhcxk3.aspx (All atomic references to the storage location x throughout the program are required to have a compatible type.)
void foo_v1(float *a, const int N) {
#pragma omp parallel for
for(int i = 1; i < N-1; i++) {
#pragma omp atomic
a[i-1] += 0.5f;
#pragma omp atomic
a[i] += 1.0f;
#pragma omp atomic
a[i+1] += 0.5f;
}
}
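One way to check this yourself is to compare the atomic version against a plain serial loop. The sketch below assumes the array starts at zero; the names stencil_serial, stencil_atomic and same_result are invented for this example. An exact comparison is reasonable here only because every partial sum (0.5, 1.0, 1.5, 2.0) is exactly representable in a float, so the order of the additions does not change the result.

```cpp
#include <cassert>
#include <vector>

// Serial reference: the same three updates per iteration, no OpenMP.
std::vector<float> stencil_serial(int n) {
    assert(n >= 3);
    std::vector<float> a(n, 0.0f);
    for (int i = 1; i < n - 1; i++) {
        a[i - 1] += 0.5f;
        a[i]     += 1.0f;
        a[i + 1] += 0.5f;
    }
    return a;
}

// Parallel version with one atomic per update, mirroring foo_v1.
std::vector<float> stencil_atomic(int n) {
    assert(n >= 3);
    std::vector<float> a(n, 0.0f);
    float *p = a.data();  // raw pointer so each update is a plain scalar lvalue
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        #pragma omp atomic
        p[i - 1] += 0.5f;
        #pragma omp atomic
        p[i] += 1.0f;
        #pragma omp atomic
        p[i + 1] += 0.5f;
    }
    return a;
}

// True if both versions produce identical arrays.
bool same_result(int n) {
    return stencil_serial(n) == stencil_atomic(n);
}
```

With other constants the floating-point sums may depend on the order of the additions, so a comparison with a small tolerance would be the safer choice in general.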
Below is my original answer before I realized your code was mixing types. It's a better solution anyway :-)
No, that's not going to work (see my correction above). You can check these things yourself by generating results with and without OpenMP and comparing them, and you will see it fails (see my correction above). You should make a table to see what's going on:
      a[0]   a[1]   a[2]   a[3]   a[4]   a[5]  ....
i=1  +=0.5  +=1.0  +=0.5
i=2         +=0.5  +=1.0  +=0.5
i=3                +=0.5  +=1.0  +=0.5
i=4                       +=0.5  +=1.0  +=0.5
...
Notice that for a[2] through a[N-3] the three updates simply add up to a[i] += 0.5 + 1.0 + 0.5. Only the first two and last two elements receive fewer contributions. You can replace the constants (0.5, 1.0, 0.5) with an array of values if you want.
Use this code.
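void foo_v2(float *a, const int N) {
// Boundary elements receive fewer contributions; this assumes N >= 4.
a[0] += 0.5f;
a[1] += 0.5f + 1.0f;
a[N-1] += 0.5f;
a[N-2] += 1.0f + 0.5f;
// Each interior element is written by exactly one iteration, so no atomic is needed.
#pragma omp parallel for
for(int i = 2; i < N-2; i++) {
a[i] += 0.5f + 1.0f + 0.5f;
}
}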
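The remark about replacing the constants with an array of values could be sketched as follows. This is only one reading of that remark: foo_weights and the three-element array w are hypothetical, with w[0], w[1] and w[2] standing for the amounts the original loop adds to a[i-1], a[i] and a[i+1].

```cpp
#include <cassert>

// Hypothetical generalization of foo_v2 (not from the original answer).
// w[0], w[1], w[2] are the amounts the scatter loop would add to
// a[i-1], a[i], a[i+1] for each i in [1, N-2].
void foo_weights(float *a, const int N, const float w[3]) {
    assert(N >= 4);  // the boundary handling assumes at least four elements
    a[0]     += w[0];          // only reached from i = 1
    a[1]     += w[0] + w[1];   // reached from i = 2 and i = 1
    a[N - 2] += w[1] + w[2];   // reached from i = N-2 and i = N-3
    a[N - 1] += w[2];          // only reached from i = N-2
    const float inner = w[0] + w[1] + w[2];
    #pragma omp parallel for
    for (int i = 2; i < N - 2; i++) {
        a[i] += inner;  // one writer per element, so no atomic is needed
    }
}
```

Summing the weights once and adding the total can round differently from three separate additions, so for arbitrary weights the result may differ from the scatter loop in the last bits.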