Specifically, is there any effective difference between:
i = a.load(memory_order_acquire);
or
a.store(5, memory_order_release);
and
atomic_thread_fence(memory_order_acquire);
i = a.load(memory_order_relaxed);
or
a.store(5, memory_order_relaxed);
atomic_thread_fence(memory_order_release);
respectively?
Do non-relaxed atomic accesses provide signal fences as well as thread fences?
In your code, for both the load and the store, the order of the fence and the atomic operation should be reversed. With that change the fenced versions are similar to the standalone operations, but there are differences.
Acquire and release operations on atomic variables act as one-way barriers, but in opposite directions. That is, a store/release operation prevents memory operations that precede it (in the program source) from being reordered after it, while a load/acquire operation prevents memory operations that follow it from being reordered before it.
// thread 1
// shared memory operations A
a.store(5, std::memory_order_release);
x = 42; // regular int
// thread 2
while (a.load(std::memory_order_acquire) != 5);
// shared memory operations B
Memory operations A cannot move down below the store/release, while memory operations B cannot move up above the load/acquire.
As soon as thread 2 reads 5, memory operations A are visible to B and synchronization is complete.
Because the store/release is only a one-way barrier, the write to x can be reordered to join, or even precede, memory operations A. But since that write is not part of the acquire/release relationship, x cannot be reliably accessed by thread 2.
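As a minimal, runnable sketch of the pattern above: the function name `publish_with_release_acquire` and the variable `payload` are made up for illustration, with `payload` standing in for "shared memory operations A".

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: publishes a plain int through a store/release
// and reads it back after a load/acquire has observed the flag value.
int publish_with_release_acquire() {
    int payload = 0;            // plain, non-atomic shared data
    std::atomic<int> a{0};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                            // operations A
        a.store(5, std::memory_order_release);   // A cannot sink below this store
    });
    std::thread consumer([&] {
        // operations B cannot be hoisted above this load
        while (a.load(std::memory_order_acquire) != 5) {
            std::this_thread::yield();
        }
        assert(payload == 42);                   // ordered after the producer's write
        seen = payload;                          // operations B
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling `publish_with_release_acquire()` from `main` should return the published value, because the consumer only reads `payload` after its load/acquire has read 5.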
Replacing the atomic operations with standalone thread fences and relaxed operations is similar:
// thread 1
// shared memory operations A
std::atomic_thread_fence(std::memory_order_release);
a.store(5, std::memory_order_relaxed);
// thread 2
while (a.load(std::memory_order_relaxed) != 5);
std::atomic_thread_fence(std::memory_order_acquire);
// shared memory operations B
This achieves the same result, but an important difference is that neither fence acts as a one-way barrier. If they did, the atomic store to a could be reordered before the release fence and the atomic load from a could be reordered after the acquire fence, which would break the synchronization relationship.
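The fence-based variant can be sketched the same way. As before, `publish_with_fences` and `payload` are illustrative names only. Note where the fences sit: the release fence comes before the relaxed store, and the acquire fence comes after the relaxed load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: same hand-off as with store/release and load/acquire,
// but built from relaxed atomic operations plus standalone thread fences.
int publish_with_fences() {
    int payload = 0;            // plain, non-atomic shared data
    std::atomic<int> a{0};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                          // operations A
        std::atomic_thread_fence(std::memory_order_release);   // fence first...
        a.store(5, std::memory_order_relaxed);                 // ...then the relaxed store
    });
    std::thread consumer([&] {
        while (a.load(std::memory_order_relaxed) != 5) {       // relaxed load first...
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);   // ...then the fence
        assert(payload == 42);                                 // ordered after the producer's write
        seen = payload;                                        // operations B
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With the fence and the atomic operation swapped on either side (as in the question), nothing would order `payload` relative to the flag, and the read in the consumer would be a data race.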
In general:
The standard allows Acquire/Release fences to be mixed with Acquire/Release operations.
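One way such mixing might look, as a sketch under the assumption that the writer uses a release fence with a relaxed store while the reader uses a plain load/acquire (the names `publish_mixed` and `payload` are invented for this example):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: a release fence on the writer side paired with
// an acquire operation (not a fence) on the reader side.
int publish_mixed() {
    int payload = 0;            // plain, non-atomic shared data
    std::atomic<int> a{0};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        std::atomic_thread_fence(std::memory_order_release);   // release fence
        a.store(5, std::memory_order_relaxed);                 // relaxed store after the fence
    });
    std::thread consumer([&] {
        // The load/acquire that reads 5 synchronizes with the release fence above.
        while (a.load(std::memory_order_acquire) != 5) {
            std::this_thread::yield();
        }
        assert(payload == 42);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

The opposite combination (store/release on the writer, relaxed load followed by an acquire fence on the reader) follows the same idea.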
Do non-relaxed atomic accesses provide signal fences as well as thread fences?
It is not fully clear to me what you are asking here, because thread fences are normally used with relaxed atomic operations. However, std::atomic_signal_fence is similar to std::atomic_thread_fence, except that it is meant to operate within a single thread (between the thread and a signal handler running on it), and therefore the compiler does not generate CPU instructions for inter-thread synchronization. It basically acts as a compiler-only barrier.
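A rough sketch of how a signal fence might be used between a thread and a signal handler on that same thread follows. All names here (`sig_data`, `sig_ready`, `sig_observed`, `on_signal`, `run_signal_fence_demo`) are hypothetical, and the sketch assumes `std::atomic<bool>` is lock-free on the target platform. Because the handler is triggered synchronously through `std::raise`, the ordering would hold here even without the fences; they only show where the fences belong when a signal can arrive asynchronously.

```cpp
#include <atomic>
#include <csignal>

int sig_data = 0;                       // plain data handed to the handler
std::atomic<bool> sig_ready{false};     // assumed lock-free on this platform
int sig_observed = -1;                  // what the handler saw

extern "C" void on_signal(int) {
    if (sig_ready.load(std::memory_order_relaxed)) {
        // Compiler-only barrier: keeps the read of sig_data after the flag check.
        std::atomic_signal_fence(std::memory_order_acquire);
        sig_observed = sig_data;
    }
}

// Hypothetical driver: installs the handler, publishes the data, raises the signal.
int run_signal_fence_demo() {
    std::signal(SIGTERM, on_signal);

    sig_data = 42;
    // Compiler-only barrier: keeps the write to sig_data before the flag store.
    std::atomic_signal_fence(std::memory_order_release);
    sig_ready.store(true, std::memory_order_relaxed);

    std::raise(SIGTERM);                // handler runs on this thread before raise returns
    std::signal(SIGTERM, SIG_DFL);      // restore the default disposition
    return sig_observed;
}
```

No hardware fence instruction is expected for `std::atomic_signal_fence`; it only constrains how the compiler may reorder the surrounding accesses.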
You need
atomic_thread_fence(memory_order_release);
a.store(5, memory_order_relaxed);
and
i = a.load(memory_order_relaxed);
atomic_thread_fence(memory_order_acquire);
to replace
a.store(5, memory_order_release);
and
i = a.load(memory_order_acquire);
Non-relaxed atomic accesses do provide signal fences as well as thread fences.