
What is the difference between using explicit fences and std::atomic?

Assuming that aligned pointer loads and stores are naturally atomic on the target platform, what is the difference between this:

    // Case 1: Dumb pointer, manual fence
    int* ptr;
    // ...
    std::atomic_thread_fence(std::memory_order_release);
    ptr = new int(-4);

this:

    // Case 2: atomic var, automatic fence
    std::atomic<int*> ptr;
    // ...
    ptr.store(new int(-4), std::memory_order_release);

and this:

    // Case 3: atomic var, manual fence
    std::atomic<int*> ptr;
    // ...
    std::atomic_thread_fence(std::memory_order_release);
    ptr.store(new int(-4), std::memory_order_relaxed);

I was under the impression that they were all equivalent; however, Relacy detects a data race in the first case (only):

    struct test_relacy_behaviour : public rl::test_suite<test_relacy_behaviour, 2>
    {
        rl::var<std::string*> ptr;
        rl::var<int> data;

        void before()
        {
            ptr($) = nullptr;
            rl::atomic_thread_fence(rl::memory_order_seq_cst);
        }

        void thread(unsigned int id)
        {
            if (id == 0) {
                std::string* p  = new std::string("Hello");
                data($) = 42;
                rl::atomic_thread_fence(rl::memory_order_release);
                ptr($) = p;
            }
            else {
                std::string* p2 = ptr($);        // <-- Test fails here after the first thread completely finishes executing (no contention)
                rl::atomic_thread_fence(rl::memory_order_acquire);

                RL_ASSERT(!p2 || *p2 == "Hello" && data($) == 42);
            }
        }

        void after()
        {
            delete ptr($);
        }
    };

I contacted the author of Relacy to find out if this was expected behaviour; he says that there is indeed a data race in my test case. However, I'm having trouble spotting it; can someone point out to me what the race is? Most importantly, what are the differences between these three cases?

Update: It's occurred to me that Relacy may simply be complaining about the atomicity (or lack thereof, rather) of the variable being accessed across threads... after all, it doesn't know that I intend only to use this code on platforms where aligned integer/pointer access is naturally atomic.

Another update: Jeff Preshing has written an excellent blog post explaining the difference between explicit fences and the built-in ones ("fences" vs "operations"). Cases 2 and 3 are apparently not equivalent! (In certain subtle circumstances, anyway.)

Cameron asked Jan 05 '13 01:01


2 Answers

I believe the code has a race. Case 1 and case 2 are not equivalent.

29.8 [atomics.fences]

-2- A release fence A synchronizes with an acquire fence B if there exist atomic operations X and Y, both operating on some atomic object M, such that A is sequenced before X, X modifies M, Y is sequenced before B, and Y reads the value written by X or a value written by any side effect in the hypothetical release sequence X would head if it were a release operation.

In case 1 your release fence does not synchronize with your acquire fence because ptr is not an atomic object and the store and load on ptr are not atomic operations.
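To make the quoted rule concrete, here is a minimal sketch of the fence-to-fence pattern with an atomic pointer, so that X and Y really are atomic operations on an atomic object M. The names `Handoff` and `run_fence_handoff` are made up for illustration, and unlike the Relacy test the reader spins until it sees a non-null pointer, so it always reads the value written by X.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <thread>

// Hypothetical illustration type; not part of the original test case.
struct Handoff {
    std::atomic<std::string*> ptr{nullptr};  // M: the atomic object
    int data = 0;                            // plain data published through the fences
};

// Returns true if the reader observed both the string and the data.
bool run_fence_handoff() {
    Handoff h;
    bool ok = false;

    std::thread writer([&h] {
        std::string* p = new std::string("Hello");
        h.data = 42;
        std::atomic_thread_fence(std::memory_order_release);  // A (release fence)
        h.ptr.store(p, std::memory_order_relaxed);            // X (atomic store to M)
    });

    std::thread reader([&h, &ok] {
        std::string* p2 = nullptr;
        // Y: atomic load of M; loop until it reads the value written by X.
        while ((p2 = h.ptr.load(std::memory_order_relaxed)) == nullptr) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);  // B (acquire fence)
        // A synchronizes with B, so the writes before A are visible here.
        ok = (*p2 == "Hello" && h.data == 42);
    });

    writer.join();
    reader.join();
    delete h.ptr.load(std::memory_order_relaxed);
    return ok;
}
```

Calling `run_fence_handoff()` from `main` should return true. If `ptr` were a plain `std::string*` instead, the load and store would not be atomic operations, the fences would have nothing to pair through, and the program would contain a data race.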

Case 2 and case 3 are equivalent (actually, not quite; see LWimsey's comments and answer), because ptr is an atomic object and the store is an atomic operation. (Paragraphs 3 and 4 of [atomics.fences] describe how a fence synchronizes with an atomic operation and vice versa.)
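One way the two forms can differ, as far as I understand the fence rule quoted above: a release fence pairs with any later atomic store in the same thread, whereas a release store only pairs with loads of that one variable. The sketch below uses made-up names (`payload`, `flag_a`, `flag_b`, `run_fence_covers_later_stores`) and is not taken from the answer it refers to.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;               // plain, non-atomic data
std::atomic<int> flag_a{0};
std::atomic<int> flag_b{0};

// Returns the payload value seen by the reader after it observes flag_b.
int run_fence_covers_later_stores() {
    payload = 0;
    flag_a.store(0);
    flag_b.store(0);
    int seen = -1;

    std::thread writer([] {
        payload = 42;
        std::atomic_thread_fence(std::memory_order_release);
        flag_a.store(1, std::memory_order_relaxed);
        flag_b.store(1, std::memory_order_relaxed);  // the fence is sequenced before this store too
    });

    std::thread reader([&seen] {
        while (flag_b.load(std::memory_order_relaxed) == 0) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;  // ordered after payload = 42 via the two fences
    });

    writer.join();
    reader.join();
    return seen;
}
```

If the writer instead used `flag_a.store(1, std::memory_order_release)` with no fence (the case 2 form), a reader that only observes `flag_b` would not synchronize with the writer, and reading `payload` would be a data race.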

The semantics of fences are defined only with respect to atomic objects and atomic operations. Whether your target platform and your implementation offer stronger guarantees (such as treating any pointer type as an atomic object) is implementation-defined at best.

N.B. In both case 2 and case 3, the acquire operation on ptr could execute before the store, and so would read garbage from the uninitialized atomic<int*>. Simply using acquire and release operations (or fences) doesn't ensure that the store happens before the load; it only ensures that if the load reads the stored value, then the code is correctly synchronized.
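A minimal sketch of guarding against that, assuming the atomic pointer is initialized to `nullptr` and the reader treats null as "not published yet". The helper names `try_read` and `run_single_load` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Single acquire load: returns false if the store has not been observed yet.
bool try_read(const std::atomic<int*>& ptr, int& out) {
    int* p = ptr.load(std::memory_order_acquire);
    if (p == nullptr) {
        return false;  // the load ran before the store became visible
    }
    out = *p;          // safe: the acquire load read the release store's value
    return true;
}

// Returns true if the reader either saw nothing or saw the fully written int.
bool run_single_load() {
    std::atomic<int*> ptr{nullptr};  // initialized, so an early load reads nullptr, not garbage
    int value = 0;
    bool saw = false;

    std::thread writer([&ptr] {
        ptr.store(new int(-4), std::memory_order_release);
    });
    std::thread reader([&ptr, &value, &saw] {
        saw = try_read(ptr, value);
    });

    writer.join();
    reader.join();
    bool consistent = !saw || value == -4;
    delete ptr.load(std::memory_order_relaxed);
    return consistent;
}
```

Whether the reader sees the pointer on any given run depends on scheduling; the only guarantee is that if it does, the pointed-to value is fully visible.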

Jonathan Wakely answered Sep 29 '22 09:09


Several pertinent references:

  • the C++11 draft standard (PDF, see clauses 1, 29 and 30);
  • Hans-J. Boehm's overview of concurrency in C++;
  • McKenney, Boehm and Crowl on concurrency in C++;
  • GCC's developmental notes on concurrency in C++;
  • the Linux kernel's notes on concurrency;
  • a related question with answers here on Stack Overflow;
  • another related question with answers;
  • Cppmem, a sandbox in which to experiment with concurrency;
  • Cppmem's help page;
  • Spin, a tool for analyzing the logical consistency of concurrent systems;
  • an overview of memory barriers from a hardware perspective (PDF).

Some of the above may interest you and other readers.

thb answered Sep 29 '22 09:09