Understanding of C++ memory order, am I wrong?

Tags: c++, multithreading, concurrency

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x = false, y = false, go = false;
int v = 0;

// t1
void write_xy() {
  while (!go) {
    std::this_thread::yield();
  }

  v = 1;                                     // 1
  x.store(true, std::memory_order_relaxed);  // 2
  y.store(true, std::memory_order_relaxed);  // 3
}

// t2
void read_yx() {
  while (!go) {
    std::this_thread::yield();
  }

  while (!y.load(std::memory_order_relaxed))
    ;

  assert(1 == x.load(std::memory_order_relaxed));  // 4
  assert(1 == v);                                  // 5
}

int main() {
  for (;;) {
    x = false;
    y = false;
    v = 0;

    go = false;
    std::thread t1(write_xy);
    std::thread t2(read_yx);
    go = true;  // start
    t1.join();
    t2.join();
  }
}

I am a beginner in C++ concurrent programming. According to my understanding of memory_order_relaxed, t2 is not guaranteed to observe the three statements in thread t1 in the order in which they were written. From the perspective of t2, the three statements in t1 may appear to happen in the order 3, 2, 1, so the asserts at 4 and 5 may fire.

After many attempts, the assert never fired, so I wrote an endless loop repeating the above procedure, and the assert still didn't fire. I then suspected that t1 finished before t2 started executing, so I introduced the go variable, which both threads wait on at the start, so that both threads begin their real work at about the same time. The assert was still not triggered.

I tested on a virtual machine running CentOS 8 with 4 CPUs. My CPU is an i5-7500.

asked Nov 20 '19 12:11

honghui bi

1 Answer

Just because the language allows something to happen doesn't mean you'll be able to reproduce it in a given circumstance.

Let's ignore the data race on v for now (even though it means your program has Undefined Behavior).

You are compiling the code for x86, which has very strong guarantees about memory ordering built in. For example, you get the exact same assembly code when you perform the stores with std::memory_order_release:

https://godbolt.org/z/pZaFDC

    mov     DWORD PTR v[rip], 1
    mov     BYTE PTR x[rip], 1
    mov     BYTE PTR y[rip], 1

So this machine code (compiled for your CPU) guarantees that both v == 1 and x == 1 are visible to any other thread that observes y == 1, because x86 does not reorder ordinary stores with other stores. Your C++ program did not have this guarantee, but this machine code does.

Similarly, using std::memory_order_acquire for the loads has no effect (only the text of the assert message changes):

https://godbolt.org/z/e2-uNA

    movzx   eax, BYTE PTR y[rip]
[...]
    movzx   eax, BYTE PTR x[rip]
[...]
    cmp     DWORD PTR v[rip], 1

Again, the platform provides the necessary guarantees already. Other platforms (e.g. ARM) provide fewer guarantees and you would see differences in the compiled binary:

https://godbolt.org/z/Ru4YdD

Here, synchronization is added to all the stores and reads:

    bl      __sync_synchronize

The above x86 code is also why the data race on v has no visible effect at the moment. However, relying on this is a terrible idea, as the compiler would be completely within its rights to, for example, move assert(v == 1); before the while (!y.load(std::memory_order_relaxed)) loop. It just currently doesn't happen to do so.
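If you want to keep experimenting with relaxed ordering without the Undefined Behavior, one option is to make v atomic as well and count surprising observations instead of asserting. The following is only a sketch: the names x_flag, y_flag, v_atomic, Observation, relaxed_writer, relaxed_reader and count_reorderings are made up for this illustration and are not from the question.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> x_flag{false}, y_flag{false};
std::atomic<int> v_atomic{0};  // atomic, so concurrent access is not a data race

// What the reader saw after it observed y_flag == true.
struct Observation {
  bool x_seen;
  bool v_seen;
};

// Hypothetical writer: the same three writes as in the question, all relaxed.
void relaxed_writer() {
  v_atomic.store(1, std::memory_order_relaxed);
  x_flag.store(true, std::memory_order_relaxed);
  y_flag.store(true, std::memory_order_relaxed);
}

// Hypothetical reader: waits for y_flag, then records what else is visible.
Observation relaxed_reader() {
  while (!y_flag.load(std::memory_order_relaxed)) {
    std::this_thread::yield();
  }
  Observation obs;
  obs.x_seen = x_flag.load(std::memory_order_relaxed);
  obs.v_seen = v_atomic.load(std::memory_order_relaxed) == 1;
  return obs;
}

// Runs `trials` rounds and returns how many times the reader saw y_flag set
// while x_flag or v_atomic still appeared unset.
int count_reorderings(int trials) {
  int reordered = 0;
  for (int i = 0; i < trials; ++i) {
    // Reset happens before the threads are created, so it is ordered
    // before everything the threads do.
    x_flag.store(false, std::memory_order_relaxed);
    y_flag.store(false, std::memory_order_relaxed);
    v_atomic.store(0, std::memory_order_relaxed);

    Observation obs{true, true};
    std::thread reader([&obs] { obs = relaxed_reader(); });
    std::thread writer(relaxed_writer);
    writer.join();
    reader.join();

    if (!obs.x_seen || !obs.v_seen) {
      ++reordered;
    }
  }
  return reordered;
}
```

The language allows count_reorderings to return a non-zero value. Given the machine code shown above, I would expect it to stay at zero on x86, while a weakly ordered platform may eventually produce a hit.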

Another way to get the assert to fire would be if the compiler reordered your loads and stores. It would be allowed to do so (whereas with release-acquire ordering as above it would not), but it doesn't, presumably because there's no point in that. You might be able to coax it into doing so by changing the surrounding code, but I can't come up with a way to do that.
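For reference, here is a minimal sketch of the release-acquire variant mentioned above, written as a single bounded trial instead of an endless loop. The names write_xy_release, read_yx_acquire and run_trial are made up for this illustration. It assumes that only the accesses to y need the stronger ordering: the release store to y publishes the earlier writes to v and x, and the acquire load that observes it makes them visible to the reader.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> x{false}, y{false};
int v = 0;  // plain int, protected by the release/acquire pair on y

// Hypothetical writer: same three writes as in the question.
void write_xy_release() {
  v = 1;
  x.store(true, std::memory_order_relaxed);
  y.store(true, std::memory_order_release);  // publishes the writes to v and x
}

// Hypothetical reader: returns true if both earlier writes were visible.
bool read_yx_acquire() {
  while (!y.load(std::memory_order_acquire)) {  // pairs with the release store
    std::this_thread::yield();
  }
  bool x_seen = x.load(std::memory_order_relaxed);
  return x_seen && v == 1;  // reading v is no longer a data race
}

// Runs one writer/reader round and returns the reader's verdict.
bool run_trial() {
  // Reset before the threads exist, so the reset is ordered before their work.
  x.store(false, std::memory_order_relaxed);
  y.store(false, std::memory_order_relaxed);
  v = 0;

  bool ok = false;
  std::thread t2([&ok] { ok = read_yx_acquire(); });
  std::thread t1(write_xy_release);
  t1.join();
  t2.join();
  return ok;
}
```

Calling run_trial() in a loop should return true every time on any conforming implementation, x86 or ARM, because the guarantee now comes from the C++ memory model rather than from the hardware.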

answered Oct 01 '22 17:10

Max Langhof
