 

Volatile in C++11

In the C++11 standard the machine model changed from a single-threaded machine to a multi-threaded one.

Does this mean that the typical static int x; void func() { x = 0; while (x == 0) {} } example of an optimized-out read will no longer happen in C++11?

EDIT: for those who don't know this example (I'm seriously astonished), please read this: https://en.wikipedia.org/wiki/Volatile_variable

EDIT2: OK, I was really expecting that everyone who knows what volatile is has seen this example.

If you use the code in the example, the read of the variable in the loop will be optimized out, making the loop endless.

The solution, of course, is to use volatile, which forces the compiler to read the variable on each access.

My question is whether this problem is obsolete in C++11: since the machine model is multi-threaded, the compiler should consider the possibility of concurrent access to the variable.
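
For reference, a minimal sketch of the pattern written out in full, with a hypothetical second thread that is supposed to end the loop:

    static int x;

    void func() {
        x = 0;
        while (x == 0) {}   // without volatile, the compiler may read x only once
    }

    // hypothetical second thread, intended to terminate the loop in func()
    void stop() {
        x = 1;
    }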

asked Oct 14 '12 by Šimon Tóth




1 Answer

Whether it is optimized out depends entirely on compilers and what they choose to optimize away. The C++98/03 memory model does not recognize the possibility that x could change between the setting of it and the retrieval of the value.

The C++11 memory model does recognize that x could be changed. However, it doesn't care. Non-atomic access to variables (i.e., not using std::atomic or proper mutexes) yields undefined behavior. So it's perfectly fine for a C++11 compiler to assume that x never changes between the write and the reads, since undefined behavior can mean, "the function never sees x change ever."
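
As a rough sketch (not what any particular compiler actually emits), the compiler is allowed to transform the loop like this, precisely because it may assume no other thread touches x:

    void func() {
        x = 0;
        if (x == 0) {      // the load may be hoisted out of the loop...
            for (;;) {}    // ...leaving an unconditional infinite loop
        }
    }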

Now, let's look at what C++11 says about volatile int x;. If you put that in there, and you have some other thread mess with x, you still have undefined behavior. Volatile does not affect threading behavior. C++11's memory model does not define reads or writes from/to x to be atomic, nor does it require the memory barriers needed for non-atomic reads/writes to be properly ordered. volatile has nothing to do with it one way or the other.

Oh, your code might work. But C++11 doesn't guarantee it.

What volatile tells the compiler is that it can't optimize memory reads from that variable. However, CPU cores have different caches, and most memory writes do not immediately go out to main memory. They get stored in that core's local cache, and may be written... eventually.

CPUs have ways to force cache lines out into memory and to synchronize memory access among different cores. These memory barriers allow two threads to communicate effectively. Merely reading from memory in one core that was written in another core isn't enough; the core that wrote the memory needs to issue a barrier, and the core that's reading it needs to have had that barrier complete before reading it to actually get the data.
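
A minimal sketch of that kind of pairing in C++11 terms (the producer/consumer names and the payload variable are illustrative): the writing thread publishes with a release store, and the reading thread observes it with an acquire load.

    #include <atomic>

    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready{false};     // flag that carries the ordering

    void producer() {
        payload = 42;                                   // plain write
        ready.store(true, std::memory_order_release);   // "barrier" on the writing side
    }

    void consumer() {
        while (!ready.load(std::memory_order_acquire))  // "barrier" on the reading side
            ;
        // payload is guaranteed to be 42 here
    }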

volatile guarantees none of this. Volatile works with memory-mapped hardware and the like because the hardware that writes that memory makes sure that the cache issue is taken care of. If CPU cores issued a memory barrier after every write, you could basically kiss any hope of performance goodbye. So C++11 has specific language saying when constructs are required to issue a barrier.

volatile is about memory access (when to read); threading is about memory integrity (what is actually stored there).

The C++11 memory model is specific about what operations will cause writes in one thread to become visible in another. It's about memory integrity, which is not something volatile handles. And memory integrity generally requires both threads to do something.

For example, if thread A locks a mutex, does a write, and then unlocks it, the C++11 memory model only requires that write to become visible to thread B if thread B later locks that same mutex. Until it actually acquires that particular lock, it's undefined what value is there. This stuff is laid out in great detail in section 1.10 of the standard.
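
A small sketch of that mutex scenario (the thread functions and the shared variable are hypothetical):

    #include <mutex>

    std::mutex m;
    int shared = 0;

    void thread_a() {
        std::lock_guard<std::mutex> lock(m);
        shared = 1;                  // write made while holding the lock
    }                                // unlock: the write is released here

    void thread_b() {
        std::lock_guard<std::mutex> lock(m);   // acquiring the same mutex...
        int seen = shared;                     // ...makes thread A's write visible
        (void)seen;
    }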

Let's look at the code you cite, with respect to the standard. Section 1.10, p8 speaks of the ability of certain library calls to cause a thread to "synchronize with" another thread. Most of the other paragraphs explain how synchronization (and other things) build an order of operations between threads. Of course, your code doesn't invoke any of this. There is no synchronization point, no dependency ordering, nothing.

Without such protection, without some form of synchronization or ordering, 1.10 p21 comes in:

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

Your program contains two conflicting actions (reading from x and writing to x). Neither is atomic, and neither is ordered by synchronization to happen before the other.

Thus, you have achieved undefined behavior.

So the only case where you get guaranteed multithreaded behavior by the C++11 memory model is if you use a proper mutex or std::atomic<int> x with the proper atomic load/store calls.
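
Applied to the original example, a minimal conforming version might look like this (a sketch; the writer function is hypothetical):

    #include <atomic>

    static std::atomic<int> x{0};

    void func() {
        x.store(0);
        while (x.load() == 0) {}    // each load is a real, properly ordered read
    }

    void writer() {
        x.store(1);                 // guaranteed to become visible to func()
    }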

Oh, and you don't need to make x volatile too. Anytime you call a (non-inline) function, that function or something it calls could modify a global variable. So the compiler cannot optimize away the read of x in the while loop. And every C++11 mechanism for synchronizing requires calling a function, which just so happens to invoke a memory barrier.

answered Oct 21 '22 by Nicol Bolas