
Why isn't a C++11 acquire_release fence enough for Dekker synchronization?

The failure of Dekker-style synchronization is typically explained with reordering of instructions. I.e., if we write

#include <atomic>

std::atomic_int X{0};
std::atomic_int Y{0};
int r1, r2;
// Thread 1: publish X, then look at Y.
static void t1() {
    X.store(1, std::memory_order_relaxed);
    r1 = Y.load(std::memory_order_relaxed);
}
// Thread 2: publish Y, then look at X.
static void t2() {
    Y.store(1, std::memory_order_relaxed);
    r2 = X.load(std::memory_order_relaxed);
}

Then each thread's load can be reordered ahead of its own store, so both threads can read 0, leading to r1==r2==0.

I was expecting an acquire_release fence to prevent this kind of reordering:

static void t1() {
    X.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acq_rel);
    r1 = Y.load(std::memory_order_relaxed);
}
static void t2() {
    Y.store(1, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acq_rel);
    r2 = X.load(std::memory_order_relaxed);
}

The load cannot be moved above the fence and the store cannot be moved below the fence, and so the bad result should be prevented.

However, experiments show r1==r2==0 can still occur. Is there a reordering-based explanation for this? Where's the flaw in my reasoning?

Jason Ptacek asked Dec 02 '14 11:12



1 Answer

As I understand it (mainly from reading Jeff Preshing's blog), atomic_thread_fence(std::memory_order_acq_rel) prevents every kind of reordering except StoreLoad, i.e., it still allows a Store to be reordered with a subsequent Load. However, this is exactly the reordering that has to be prevented in your example.

More precisely, atomic_thread_fence(std::memory_order_acquire) prevents any Load before the fence from being reordered with any Load or Store after it, i.e., it prevents LoadLoad and LoadStore reorderings across the fence.

atomic_thread_fence(std::memory_order_release) prevents any Store after the fence from being reordered with any Load or Store before it, i.e., it prevents LoadStore and StoreStore reorderings across the fence.

atomic_thread_fence(std::memory_order_acq_rel) then prevents the union of the two, i.e., LoadLoad, LoadStore, and StoreStore reorderings, which means that only StoreLoad reordering may still happen.
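
A minimal sketch of one way to close that gap: as far as I know, std::memory_order_seq_cst is the only fence ordering that also rules out StoreLoad reordering, because all seq_cst fences take part in a single total order. With seq_cst fences in place of the acq_rel ones, at least one of the two loads should observe the other thread's store. The helper name count_both_zero and the loop around the test are assumptions made for illustration; X, Y, r1, and r2 follow the question.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper (not from the question): runs the two-thread test
// `iterations` times and counts how often r1 == r2 == 0 is observed.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> X{0};
        std::atomic<int> Y{0};
        int r1 = -1;
        int r2 = -1;

        std::thread a([&] {
            X.store(1, std::memory_order_relaxed);
            // seq_cst fence: the store above may not be reordered
            // with the load below (no StoreLoad reordering).
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = Y.load(std::memory_order_relaxed);
        });
        std::thread b([&] {
            Y.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = X.load(std::memory_order_relaxed);
        });

        // join() makes the plain writes to r1 and r2 visible here.
        a.join();
        b.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

Calling count_both_zero(100000) from main is expected to return 0. Swapping the fences back to std::memory_order_acq_rel may produce a nonzero count on hardware that lets a store be reordered with a later load, such as x86, although a small test like this is not guaranteed to expose it.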

Toby Brull answered Oct 02 '22 01:10