Is the following code guaranteed to return the expected value of counter (40,000,000), according to the C++11 memory model? (NOT limited to x86).
#include <atomic>
#include <thread>
using namespace std;
void ThreadProc(atomic<int>& counter)
{
for (int i = 0; i < 10000000; i++)
counter.fetch_add(1, memory_order_relaxed);
}
int main()
{
#define COUNT 4
atomic<int> counter = { 0 };
thread threads[COUNT] = {};
for (size_t i = 0; i < COUNT; i++)
threads[i] = thread(ThreadProc, ref(counter));
for (size_t i = 0; i < COUNT; i++)
threads[i].join();
printf("Counter: %i", counter.load(memory_order_relaxed));
return 0;
}
In particular, will relaxed atomics coordinate such that two threads will not simultaneously read the current value, independently increment it, and both write their incremented value, effectively losing one of the writes?
Some lines from the spec seem to indicate that counter must consistently be 40,000,000 in the above example.
[Note: operations specifying memory_order_relaxed are relaxed with respect to memory ordering. Implementations must still guarantee that any given atomic access to a particular atomic object be indivisible with respect to all other atomic accesses to that object. — end note
.
Atomic read-modify-write operations shall always read the last value (in the modification order) written the write associated with the read-modify-write operation.
.
All modifications to a particular atomic object M occur in some particular total order, called the modification order of M. If A and B are modifications of an atomic object M and A happens before (as defined below) B, then A shall precede B in the modification order of M, which is defined below.
This talk also supports the notion that the above code is race free. https://www.youtube.com/watch?v=KeLBd2EJLOU&feature=youtu.be&t=1h9m30s
It appears to me that there is an indivisible ordering of the atomic operations, but we have no guarantees what the order is. So all increments must take place 'one before the other' without the race I described above.
But then a few things potentially point in the other direction:
Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.
I've been informed by a coworker that there are known mistakes in Sutter's talk. Though I've yet to find any sources for this.
Multiple members of the C++ community smarter than I have implied that a relaxed atomic add could be buffered such that a subsequent relaxed atomic add could read and operator on the stale value.
The code in your question is race free; all increments are ordered and the outcome of 40000000 is guaranteed.
The references in your question contain all the relevant quotes from the standard.
The part where it says that atomic stores should be visible within a reasonable time applies only to single stores.
In your case, the counter is incremented with an atomic read-modift-write operation and those are guaranteed to operate on the latest in the modification order.
Multiple members of the C++ community (...) have implied that a relaxed atomic add could be buffered such that a subsequent relaxed atomic add could read and operator on the stale value.
This is not possible, as long as the modifications are based on atomic read-modify-write operations.
Atomic increments would be useless if a reliable outcome was not guaranteed by the standard
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With