I want to implement the following function, which marks some elements of an array with 1.
void mark(std::vector<signed char>& marker)
{
    #pragma omp parallel for schedule(dynamic, M)
    for (int i = 0; i < marker.size(); i++)
        marker[i] = 0;

    #pragma omp parallel for schedule(dynamic, M)
    for (int i = 0; i < marker.size(); i++)
        marker[getIndex(i)] = 1; // is it ok?
}
What will happen if we try to set the same element to 1 from different threads at the same time? Will it simply be set to 1, or may this loop lead to unexpected behavior?
This answer is wrong in one fundamental part (emphasis mine):
If you write with different threads to the very same location, you get a race condition. This is not necessarily undefined behaviour, but nevertheless it need to be avoided.
Looking at the OpenMP standard, section 1.4.1 says (again, emphasis mine):
If multiple threads write without synchronization to the same memory unit, including cases due to atomicity considerations as described above, then a data race occurs. Similarly, if at least one thread reads from a memory unit and at least one thread writes without synchronization to that same memory unit, including cases due to atomicity considerations as described above, then a data race occurs. If a data race occurs then the result of the program is unspecified.
Technically, the OP's snippet is in undefined-behavior territory: the OpenMP document calls the result "unspecified", but in the base language (C++11 and later) a data race is outright undefined behavior. Either way, there is no guarantee about the behavior of the program until the race is removed from it.
The simplest way to remove the race is to protect the memory access with an atomic operation:
#pragma omp parallel for schedule(dynamic, M)
for (int i = 0; i < marker.size(); i++)
{
    #pragma omp atomic write seq_cst
    marker[getIndex(i)] = 1;
}
but that will probably hurt performance noticeably (as @schorsch312 correctly noted).
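If the atomic writes prove too expensive, one way to avoid the race altogether is to let every thread mark a private buffer and merge the buffers at the end. This is only a minimal sketch, not a drop-in replacement: it assumes getIndex() depends only on i and that a per-thread copy of the marker array is affordable, and the chunk size 64 is an arbitrary stand-in for M.

#include <algorithm>
#include <vector>

int getIndex(int i); // assumed to exist, as in the question

void mark(std::vector<signed char>& marker)
{
    const int n = static_cast<int>(marker.size());

    // Zero the marker sequentially (the original did this with a parallel loop).
    std::fill(marker.begin(), marker.end(), 0);

    #pragma omp parallel
    {
        // Each thread writes only to its own buffer, so no data race occurs.
        std::vector<signed char> local(n, 0);

        #pragma omp for schedule(dynamic, 64) nowait
        for (int i = 0; i < n; i++)
            local[getIndex(i)] = 1;

        // Only the merge is serialized, not the marking loop itself.
        #pragma omp critical
        {
            for (int j = 0; j < n; j++)
                if (local[j])
                    marker[j] = 1;
        }
    }
}

Whether this beats the atomic version depends on the size of marker and on how contended the writes are, so it is worth measuring both.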