atomic<T>.load() with std::memory_order_release

Question

In C++11 code that uses the newly introduced atomic operations to take advantage of weaker memory orderings, you usually see either

std::atomic<int> vv;
int i = vv.load(std::memory_order_acquire);

or

vv.store(42, std::memory_order_release);

It is clear to me why this makes sense.

My questions are: Do the combinations vv.store(42, std::memory_order_acquire) and vv.load(std::memory_order_release) also make sense? In which situation could one use them? What are the semantics of these combinations?

Mat · Accepted Answer

That's simply not allowed. The C++11 standard restricts which memory order arguments you can pass to load and store operations; passing one that violates a "Requires" clause is undefined behavior.

For load (§29.6.5):

Requires: The order argument shall not be memory_order_release nor memory_order_acq_rel.

For store:

Requires: The order argument shall not be memory_order_consume, memory_order_acquire, nor memory_order_acq_rel.
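
To make the two clauses concrete, here is a minimal sketch of the orderings that remain available. The function names publish, observe, and bump are invented for this illustration and do not come from the standard. A store accepts relaxed, release, or seq_cst; a load accepts relaxed, consume, acquire, or seq_cst. memory_order_acq_rel only fits an operation that both reads and writes, such as fetch_add.

```cpp
#include <atomic>

std::atomic<int> vv{0};

// Store side: release is permitted (as are relaxed and seq_cst).
void publish(int value) {
    vv.store(value, std::memory_order_release);
}

// Load side: acquire is permitted (as are relaxed, consume and seq_cst).
int observe() {
    return vv.load(std::memory_order_acquire);
}

// A read-modify-write is both a load and a store, so acq_rel is valid here.
// fetch_add returns the value held before the addition.
int bump() {
    return vv.fetch_add(1, std::memory_order_acq_rel);
}
```

Calling publish(42) and then observe() on the same thread returns 42; a following bump() returns 42 and leaves 43 in vv.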

Kaz Wesley · Answer

The C/C++/LLVM memory model is sufficient for synchronization strategies that ensure data is ready to be accessed before accessing it. While that covers most common synchronization primitives, useful properties can be obtained by building consistent models on weaker guarantees.

The biggest example is the seqlock. It relies on "speculatively" reading data that may not be in a consistent state. Because reads are allowed to race with writes, readers don't block writers -- a property the Linux kernel uses to allow the system clock to be updated even if a user process is repeatedly reading it. Another strength of the seqlock is that on modern SMP architectures it scales perfectly with the number of readers: because the readers don't take any locks, they only need shared access to the cache lines.

The ideal implementation of a seqlock would use something like a "release load" in the reader, which is not available in any major programming language. The kernel works around this with a full read fence, which scales well across architectures, but doesn't achieve optimal performance.
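
For illustration, here is a rough sketch of how a seqlock can be written within the C++11 rules. The type SeqLockPair and its members are hypothetical, and the sketch assumes a single writer. Since a "release load" does not exist, the reader puts an acquire fence between the data loads and the second read of the sequence counter, which plays a role similar to the kernel's read fence. The data fields are atomics accessed with relaxed ordering so that the racing reads are not a data race in the C++ sense.

```cpp
#include <atomic>
#include <utility>

struct SeqLockPair {
    std::atomic<unsigned> seq{0};  // odd while a write is in progress
    std::atomic<int> a{0};
    std::atomic<int> b{0};

    // Assumption: only one thread ever calls write().
    void write(int x, int y) {
        unsigned s = seq.load(std::memory_order_relaxed);
        seq.store(s + 1, std::memory_order_relaxed);          // mark write in progress
        std::atomic_thread_fence(std::memory_order_release);  // order the mark before the data stores
        a.store(x, std::memory_order_relaxed);
        b.store(y, std::memory_order_relaxed);
        seq.store(s + 2, std::memory_order_release);          // publish the new data
    }

    // Readers never block the writer; they retry if a write overlapped.
    std::pair<int, int> read() const {
        for (;;) {
            unsigned s0 = seq.load(std::memory_order_acquire);
            int x = a.load(std::memory_order_relaxed);
            int y = b.load(std::memory_order_relaxed);
            // Stand-in for the missing "release load": keeps the data loads
            // from being reordered after the second read of seq.
            std::atomic_thread_fence(std::memory_order_acquire);
            unsigned s1 = seq.load(std::memory_order_relaxed);
            if (s0 == s1 && (s0 & 1u) == 0) {
                return {x, y};  // consistent snapshot
            }
        }
    }
};
```

The fence is stronger than strictly necessary, which is the cost the answer is pointing at: the reader only needs its data loads ordered before the final counter read, but the language offers no load-side operation that expresses exactly that.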

Tags: c++, multithreading, c++11, memory-model, stdatomic

Toby Brull
