For any std::atomic<T> where T is a primitive type: if I blindly use std::memory_order_acq_rel for fetch_xxx operations, std::memory_order_acquire for load operations, and std::memory_order_release for store operations (I mean just resetting the default memory ordering of those functions), will the results be the same as if I had used std::memory_order_seq_cst (the default) for all of the declared operations? And if the results are the same, is this usage any different from std::memory_order_seq_cst in terms of efficiency?

Acquire-release ordering guarantees that all memory operations which happen before the storing operation (for example, y.store(true, std::memory_order_release)) in one thread will be visible to the other thread that performs the corresponding loading operation (here, y.load(std::memory_order_acquire)).
An operation has acquire semantics if other processors will always see its effect before any subsequent operation's effect. An operation has release semantics if other processors will see every preceding operation's effect before the effect of the operation itself.
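A read-modify-write operation can carry both semantics at once via std::memory_order_acq_rel. As an illustration (this reference-counting scheme is my sketch, not part of the answers above), the classic shared-object release uses exactly that:

#include <atomic>

struct Node {
    std::atomic<int> refcount{1};
    // ... payload ...
};

void release_ref(Node* n) {
    // The release half publishes this thread's earlier writes to the object
    // before the count drops; the acquire half lets the thread that drops
    // the count to zero observe every other thread's writes before deleting.
    if (n->refcount.fetch_sub(1, std::memory_order_acq_rel) == 1)
        delete n;
}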
The x86 is not sequentially consistent. The technical explanation is that, instead of one bus serializing all memory accesses, each core uses its own memory cache. Writes propagate from one cache to another with finite speed (measured in clock cycles).
The C++ memory model guarantees sequential consistency only if you use std::memory_order_seq_cst (the default) for the atomic operations involved. If you just use plain non-atomic operations, or relaxed atomics, and no mutexes, then sequential consistency is not guaranteed.
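The classic store-buffering litmus test makes the gap concrete; the variable names below are assumed for illustration. Even with acquire/release orderings, both threads may read 0, an outcome no sequentially consistent interleaving allows:

#include <atomic>

std::atomic<int> x{0}, y{0};
int r1, r2;

void thread1() {
    x.store(1, std::memory_order_release);
    r1 = y.load(std::memory_order_acquire); // may still read 0
}

void thread2() {
    y.store(1, std::memory_order_release);
    r2 = x.load(std::memory_order_acquire); // may still read 0
}
// With release/acquire (or relaxed), r1 == 0 && r2 == 0 is a permitted
// outcome: neither thread's store is ordered before its own later load.
// If every operation above used std::memory_order_seq_cst, the single
// total order of seq_cst operations would force at least one of r1, r2
// to be 1.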
The C++11 memory ordering parameters for atomic operations specify constraints on the ordering. If you do a store with std::memory_order_release, and a load from another thread reads the value with std::memory_order_acquire, then subsequent read operations from the second thread will see any values stored to any memory location by the first thread prior to the store-release, or a later store to any of those memory locations.

If both the store and subsequent load are std::memory_order_seq_cst then the relationship between these two threads is the same. You need more threads to see the difference.
e.g. std::atomic<int> variables x and y, both initially 0.

Thread 1:

x.store(1,std::memory_order_release);

Thread 2:

y.store(1,std::memory_order_release);

Thread 3:

int a=x.load(std::memory_order_acquire); // x before y
int b=y.load(std::memory_order_acquire);

Thread 4:

int c=y.load(std::memory_order_acquire); // y before x
int d=x.load(std::memory_order_acquire);
As written, there is no relationship between the stores to x and y, so it is quite possible to see a==1, b==0 in thread 3, and c==1 and d==0 in thread 4.
If all the memory orderings are changed to std::memory_order_seq_cst then this enforces an ordering between the stores to x and y. Consequently, if thread 3 sees a==1 and b==0 then that means the store to x must be before the store to y, so if thread 4 sees c==1, meaning the store to y has completed, then the store to x must also have completed, so we must have d==1.
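For completeness, here is a runnable sketch that assembles the four threads above (the scaffolding is assumed, not part of the original answer); note that the a==1, b==0, c==1, d==0 outcome is rare, and on strongly ordered hardware such as x86 you may never observe it:

#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<int> x{0}, y{0};
int a, b, c, d;

int main() {
    std::thread t1([] { x.store(1, std::memory_order_release); });
    std::thread t2([] { y.store(1, std::memory_order_release); });
    std::thread t3([] {
        a = x.load(std::memory_order_acquire); // x before y
        b = y.load(std::memory_order_acquire);
    });
    std::thread t4([] {
        c = y.load(std::memory_order_acquire); // y before x
        d = x.load(std::memory_order_acquire);
    });
    t1.join(); t2.join(); t3.join(); t4.join();
    if (a == 1 && b == 0 && c == 1 && d == 0)
        std::puts("threads 3 and 4 saw the stores in opposite orders");
}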
In practice, using std::memory_order_seq_cst everywhere will add additional overhead to either loads or stores or both, depending on your compiler and processor architecture. e.g. a common technique for x86 processors is to use XCHG instructions rather than MOV instructions for std::memory_order_seq_cst stores, in order to provide the necessary ordering guarantees, whereas for std::memory_order_release a plain MOV will suffice. On systems with more relaxed memory architectures the overhead may be greater, since plain loads and stores have fewer guarantees.
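One way to see this cost is to compare the code a compiler emits for the two store orderings, e.g. in a compiler explorer; the instructions noted below are typical for x86-64 but depend on compiler and target:

#include <atomic>

std::atomic<int> g{0};

void store_release(int v) {
    g.store(v, std::memory_order_release); // x86-64: a plain MOV
}

void store_seq_cst(int v) {
    g.store(v, std::memory_order_seq_cst); // x86-64: typically XCHG (or MOV + MFENCE)
}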
Memory ordering is hard. I devoted almost an entire chapter to it in my book.