std::atomic functions such as store and load take a std::memory_order argument. The argument can be determined at runtime, like any other function argument. However, the actual value may affect how the code is optimized during compilation. Consider the following:
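#include <atomic>

std::atomic<int> ai1, ai2;
int value = 0;                        // stands in for "whatever": any value works
std::memory_order getMemoryOrder();   // assumed to be defined elsewhere

void foo() {
    std::memory_order memOrd = getMemoryOrder();
    int v = value;          // load value from memory (into a register)
    ai1.store(v, memOrd);   // dependency on v's value
    ai2.store(1, memOrd);   // no dependency; could this be moved up?
}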
If memOrd happens to be memory_order_relaxed, the second store could safely be moved in front of the first one. This adds some extra work between loading value and using it, which might hide a stall that would otherwise be required. However, if memOrd is memory_order_seq_cst, switching the stores should not be allowed, because some other thread might count on ai1 already being set to value if ai2 is set to 1.
What I'm wondering is why the memory order was defined as a runtime argument rather than a compile-time one. Is there any reason for someone to examine the environment at runtime before deciding on the best memory-ordering semantics?
The default is std::memory_order_seq_cst, which establishes a single total order over all atomic operations that use it: all threads see the same order of such operations, and no memory_order_seq_cst operation can be reordered relative to another.
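As an illustration of the single total order, here is a minimal sketch of the classic "store buffering" test. The function name bothReadZero and the variables x, y, r1 and r2 are made up for this example. With memory_order_seq_cst on all four operations, at least one thread must observe the other's store, so the function should never return true.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs the "store buffering" test once.
// Returns true only if both threads read 0, which seq_cst rules out.
bool bothReadZero() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread a([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread b([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    a.join();  // join() makes the writes to r1 and r2 visible here
    b.join();

    return r1 == 0 && r2 == 0;
}
```

If the four operations used memory_order_relaxed, or even release stores paired with acquire loads, the outcome r1 == 0 and r2 == 0 would be permitted.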
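memory_order_acquire: A load with this order that reads a value written by a release store synchronizes with that store, so everything the storing thread wrote before the release (including relaxed atomic and non-atomic writes) is visible after the load. This is pairwise between the two threads involved; it does not sync all atomic variables on all threads.
memory_order_release: Publishes the atomic store, together with the writes sequenced before it, to other threads (but only if they read the var with consume/acquire).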
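The acquire/release pairing can be sketched as a one-shot hand-off between two threads. The names handOff, payload, ready and seen are assumptions made for this example only. The release store publishes the earlier plain write to payload, and the acquire load that sees ready == true is then guaranteed to see that write.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: passes one value from a producer to a consumer.
int handOff() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that guards the payload
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // written before the release
        ready.store(true, std::memory_order_release);   // publish
    });
    std::thread consumer([&] {
        // Spin until the flag is observed; the acquire load pairs with the release store.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // safe: ordered after the producer's write to payload
    });
    producer.join();
    consumer.join();
    return seen;
}
```

Only the thread that performs the acquire load gets this guarantee; a third thread that never reads ready with acquire (or consume) is promised nothing about payload.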
The reason this is implemented as a runtime parameter rather than a compile-time parameter is to enable composition.
Suppose you are writing a function that uses the provided atomic operations to do the equivalent of a load operation, but on a higher-level construct. Because the memory order is a runtime parameter, the higher-level load can simply forward a memory order supplied by the user to the low-level atomic operation that provides the ordering, and the higher-level operation does not have to be a template.
Typically, the atomic operations will be inlined, and the compiler will eliminate the test of the memory order parameter when it is actually a compile-time constant.
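A minimal sketch of that kind of composition, assuming a made-up BoundedCounter class (not part of any library): its get and tryIncrement members take a std::memory_order and forward it to the underlying std::atomic, so neither has to be a template.

```cpp
#include <atomic>

// Hypothetical higher-level construct built on a single atomic.
class BoundedCounter {
public:
    explicit BoundedCounter(int limit) : limit_(limit) {}

    // Higher-level "load": forwards the caller's ordering to the atomic load.
    // The caller must pass an order that is valid for a load
    // (relaxed, consume, acquire or seq_cst).
    int get(std::memory_order order = std::memory_order_seq_cst) const {
        return count_.load(order);
    }

    // Increments unless the limit is reached; returns true on success.
    // 'order' applies to the successful read-modify-write only.
    bool tryIncrement(std::memory_order order = std::memory_order_seq_cst) {
        int current = count_.load(std::memory_order_relaxed);
        while (current < limit_) {
            // On failure, 'current' is refreshed with the latest value.
            if (count_.compare_exchange_weak(current, current + 1,
                                             order, std::memory_order_relaxed)) {
                return true;
            }
        }
        return false;
    }

private:
    std::atomic<int> count_{0};
    int limit_;
};
```

A call such as counter.get(std::memory_order_acquire) passes a constant, so once get is inlined the compiler can usually treat the order as known at compile time, while a caller that picks the order at runtime can use the same function unchanged.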