Understanding of C++ memory order: am I wrong?

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x = false, y = false, go = false;
int v = 0;

// t1
void write_xy() {
  while (!go) {
    std::this_thread::yield();
  }

  v = 1;                                     // 1
  x.store(true, std::memory_order_relaxed);  // 2
  y.store(true, std::memory_order_relaxed);  // 3
}

// t2
void read_yx() {
  while (!go) {
    std::this_thread::yield();
  }

  while (!y.load(std::memory_order_relaxed))
    ;

  assert(1 == x.load(std::memory_order_relaxed));  // 4
  assert(1 == v);                                  // 5
}

int main() {
  for (;;) {
    x = false;
    y = false;
    v = 0;

    go = false;
    std::thread t1(write_xy);
    std::thread t2(read_yx);
    go = true;  // start
    t1.join();
    t2.join();
  }
}

I am a beginner in C++ concurrent programming. As I understand memory_order_relaxed, t2 has no guarantee of observing the three statements in t1 in program order. From t2's perspective they may appear to happen in the order 3, 2, 1, so the asserts at 4 and 5 may fire.

After many attempts the assert never fired, so I wrapped the procedure in an endless loop, and it still didn't fire. I then suspected that t1 was finishing before t2 started executing, so I introduced the go variable, which both threads wait on at the start so that they begin running at about the same time. The assert still did not fire.

I tested on a virtual machine running CentOS 8 with 4 CPUs. The CPU is an i5-7500.

honghui bi asked Nov 20 '19 12:11


1 Answer

Just because the language allows something to happen doesn't mean you'll be able to reproduce it in a given circumstance.

Let's ignore the data race on v for now (even though it means your program has Undefined Behavior).

You are compiling the code for x86, which has very strong guarantees about memory ordering built in. For example, you get the exact same assembly code when you perform the stores with std::memory_order_release:

https://godbolt.org/z/pZaFDC

    mov     DWORD PTR v[rip], 1
    mov     BYTE PTR x[rip], 1
    mov     BYTE PTR y[rip], 1

So this code (compiled for your CPU) is guaranteed to have both v == 1 and x == 1 visible to all other threads when y == 1. Your C++ program did not have this guarantee, but this machine code does.

Similarly, using std::memory_order_acquire for the loads has no effect (only the text of the assert message changes):

https://godbolt.org/z/e2-uNA

    movzx   eax, BYTE PTR y[rip]
    [...]
    movzx   eax, BYTE PTR x[rip]
    [...]
    cmp     DWORD PTR v[rip], 1

Again, the platform provides the necessary guarantees already. Other platforms (e.g. ARM) provide fewer guarantees and you would see differences in the compiled binary:

https://godbolt.org/z/Ru4YdD

Here, synchronization is added to all the stores and reads:

    bl      __sync_synchronize
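
To get on every platform the guarantee that the x86 machine code happens to provide, the ordering has to be requested in the source. Below is a minimal sketch of the question's writer/reader pair with a release store and an acquire load on y. The function name run_release_acquire_once and the lambda-based layout are assumptions made for illustration, not part of the original code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the question): runs one writer/reader round
// and reports whether the reader observed both x == true and v == 1.
bool run_release_acquire_once() {
  std::atomic<bool> x{false}, y{false};
  int v = 0;  // plain int: race-free only because of the release/acquire pair on y
  bool x_seen = false;
  int v_seen = 0;

  std::thread writer([&] {
    v = 1;                                     // 1
    x.store(true, std::memory_order_relaxed);  // 2
    y.store(true, std::memory_order_release);  // 3: publishes 1 and 2
  });

  std::thread reader([&] {
    // The acquire load pairs with the release store at 3.
    while (!y.load(std::memory_order_acquire)) {
      std::this_thread::yield();
    }
    x_seen = x.load(std::memory_order_relaxed);  // 4: must read true
    v_seen = v;                                  // 5: must read 1, no data race
  });

  writer.join();
  reader.join();
  // join() synchronizes with thread completion, so reading the results here is safe.
  return x_seen && v_seen == 1;
}
```

Calling run_release_acquire_once() in a loop should return true on every iteration on any conforming implementation, because everything sequenced before the release store is visible after the acquire load that reads it. Only the operations on y need the stronger ordering; the store and load on x can stay relaxed.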

The above x86 code is also why the data race on v has no effect at this moment. However, relying on this is a terrible idea, as the compiler would be completely within its rights to e.g. move assert(v == 1); before the while (!y.load(std::memory_order_relaxed)). It just currently doesn't happen to do so.

Another way to get the assert would be if the compiler reordered your loads and stores. It would be allowed to do so (whereas with release-acquire ordering as above it would not), but it doesn't, presumably because there's no point in that. You might be able to coax it into doing so by changing the surrounding code, but I can't come up with a way to do that.
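
If the atomic operations themselves should stay relaxed, one alternative that also rules out this kind of reordering is to use explicit fences. The sketch below assumes the same writer/reader shape as the question; the name run_fenced_once is hypothetical, and std::atomic_thread_fence is used here as a stand-in for the release/acquire orderings discussed above, not as something the original code does.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the question): same experiment, but the
// ordering comes from fences while every atomic access stays relaxed.
bool run_fenced_once() {
  std::atomic<bool> x{false}, y{false};
  int v = 0;
  bool x_seen = false;
  int v_seen = 0;

  std::thread writer([&] {
    v = 1;
    x.store(true, std::memory_order_relaxed);
    // Release fence: the writes above may not be reordered past the store to y.
    std::atomic_thread_fence(std::memory_order_release);
    y.store(true, std::memory_order_relaxed);
  });

  std::thread reader([&] {
    while (!y.load(std::memory_order_relaxed)) {
      std::this_thread::yield();
    }
    // Acquire fence: the reads below may not be reordered before the load of y.
    std::atomic_thread_fence(std::memory_order_acquire);
    x_seen = x.load(std::memory_order_relaxed);
    v_seen = v;
  });

  writer.join();
  reader.join();
  return x_seen && v_seen == 1;
}
```

The release fence before the relaxed store to y and the acquire fence after the relaxed load that reads it synchronize with each other, so neither the compiler nor the hardware may move the accesses to v and x across them.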

Max Langhof answered Oct 01 '22 17:10