 

Volatile in C++11

In the C++11 standard the machine model changed from a single-threaded machine to a multi-threaded one.

Does this mean that the typical static int x; void func() { x = 0; while (x == 0) {} } example of an optimized-out read will no longer happen in C++11?

EDIT: for those who don't know this example (I'm seriously astonished), please read this: https://en.wikipedia.org/wiki/Volatile_variable

EDIT2: OK, I was really expecting that everyone who knows what volatile is has seen this example.

If you use the code in the example, the read of the variable in the loop will be optimized out, making the loop endless.

The solution, of course, is to use volatile, which forces the compiler to read the variable on each access.

My question is whether this problem is obsolete in C++11: since the machine model is multi-threaded, the compiler should consider the possibility of concurrent access to the variable.
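
For reference, a minimal sketch of the pattern written out in full, with a hypothetical second thread that is supposed to end the loop:

    static int x;

    void func() {
        x = 0;
        while (x == 0) {}   // without volatile, the compiler may read x only once
    }

    // hypothetical second thread, intended to terminate the loop in func()
    void stop() {
        x = 1;
    }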

asked Oct 14 '12 by Šimon Tóth




1 Answer

Whether it is optimized out depends entirely on compilers and what they choose to optimize away. The C++98/03 memory model does not recognize the possibility that x could change between the setting of it and the retrieval of the value.

The C++11 memory model does recognize that x could be changed. However, it doesn't care. Non-atomic access to variables (i.e., not using std::atomic or proper mutexes) yields undefined behavior. So it's perfectly fine for a C++11 compiler to assume that x never changes between the write and the reads, since undefined behavior can mean, "the function never sees x change ever."
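
As a rough sketch (not what any particular compiler actually emits), the compiler is allowed to transform the loop like this, precisely because it may assume no other thread touches x:

    void func() {
        x = 0;
        if (x == 0) {      // the load may be hoisted out of the loop...
            for (;;) {}    // ...leaving an unconditional infinite loop
        }
    }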

Now, let's look at what C++11 says about volatile int x;. If you put that in there, and you have some other thread mess with x, you still have undefined behavior. Volatile does not affect threading behavior. C++11's memory model does not define reads or writes from/to x to be atomic, nor does it require the memory barriers needed for non-atomic reads/writes to be properly ordered. volatile has nothing to do with it one way or the other.

Oh, your code might work. But C++11 doesn't guarantee it.

What volatile tells the compiler is that it can't optimize memory reads from that variable. However, CPU cores have different caches, and most memory writes do not immediately go out to main memory. They get stored in that core's local cache, and may be written... eventually.

CPUs have ways to force cache lines out into memory and to synchronize memory access among different cores. These memory barriers allow two threads to communicate effectively. Merely reading from memory in one core that was written in another core isn't enough; the core that wrote the memory needs to issue a barrier, and the core that's reading it needs to have had that barrier complete before reading it to actually get the data.
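
A minimal sketch of that kind of pairing in C++11 terms (the producer/consumer names and the payload variable are illustrative): the writing thread publishes with a release store, and the reading thread observes it with an acquire load.

    #include <atomic>

    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready{false};     // flag that carries the ordering

    void producer() {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // "barrier" on the writing side
    }

    void consumer() {
        while (!ready.load(std::memory_order_acquire))  // "barrier" on the reading side
            ;
        // payload is guaranteed to be 42 here
    }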

volatile guarantees none of this. Volatile works with memory-mapped hardware and the like because the hardware that writes that memory makes sure that the cache issue is taken care of. If CPU cores issued a memory barrier after every write, you could basically kiss any hope of performance goodbye. So C++11 has specific language saying when constructs are required to issue a barrier.

volatile is about memory access (when to read); threading is about memory integrity (what is actually stored there).

The C++11 memory model is specific about what operations will cause writes in one thread to become visible in another. It's about memory integrity, which is not something volatile handles. And memory integrity generally requires both threads to do something.

For example, if thread A locks a mutex, does a write, and then unlocks it, the C++11 memory model only requires that write to become visible to thread B if thread B later locks that same mutex. Until it actually acquires that particular lock, it's undefined what value is there. This stuff is laid out in great detail in section 1.10 of the standard.
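
A small sketch of that mutex scenario (the thread functions and the shared variable are hypothetical):

    #include <mutex>

    std::mutex m;
    int shared = 0;

    void thread_a() {
        std::lock_guard<std::mutex> lock(m);
        shared = 1;                  // write made while holding the lock
    }                                // unlock: the write is released here

    void thread_b() {
        std::lock_guard<std::mutex> lock(m);   // acquiring the same mutex...
        int seen = shared;                     // ...makes thread A's write visible
        (void)seen;
    }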

Let's look at the code you cite, with respect to the standard. Section 1.10, p8 speaks of the ability of certain library calls to cause a thread to "synchronize with" another thread. Most of the other paragraphs explain how synchronization (and other things) build an order of operations between threads. Of course, your code doesn't invoke any of this. There is no synchronization point, no dependency ordering, nothing.

Without such protection, without some form of synchronization or ordering, 1.10 p21 comes in:

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

Your program contains two conflicting actions (reading from x and writing to x). Neither is atomic, and neither is ordered by synchronization to happen before the other.

Thus, you have achieved undefined behavior.

So the only case where you get guaranteed multithreaded behavior by the C++11 memory model is if you use a proper mutex or std::atomic<int> x with the proper atomic load/store calls.
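
Applied to the original example, a minimal conforming version might look like this (a sketch; the writer function is hypothetical):

    #include <atomic>

    static std::atomic<int> x{0};

    void func() {
        x.store(0);
        while (x.load() == 0) {}    // each load is a real, properly ordered read
    }

    void writer() {
        x.store(1);                 // guaranteed to become visible to func()
    }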

Oh, and you don't need to make x volatile too. Anytime you call a (non-inline) function, that function or something it calls could modify a global variable. So the compiler cannot optimize away the read of x in the while loop. And every C++11 mechanism for synchronizing requires calling a function, which just so happens to invoke a memory barrier.

answered Oct 21 '22 by Nicol Bolas