Please see the following code valid in C and C++: <pre class="prettyprint"><code>extern int output; extern int input; extern int error_flag; void func(void) { if (0 != error_flag) { output = -1; } else { output = input; } } </code></pre> <ol> <li> Is the compiler allowed to compile the above code in the same way as if it looked like below? <pre class="prettyprint"><code>extern int output; extern int input; extern int error_flag; void func(void) { output = -1; if (0 == error_flag) { output = input; } } </code></pre> In other words, is the compiler allowed to generate (from the first snippet) code that always makes a temporary assignment of -1 to <code>output</code> and then assign <code>input</code> value to <code>output</code> depending on <code>error_flag</code> status? </li> <li>Would the compiler be allowed to do it if <code>output</code> would be declared as volatile?</li> <li>Would the compiler be allowed to do it if <code>output</code> would be declared as <code>atomic_int</code> (stdatomic.h)?</li> </ol> Update after David Schwartz's comment: If the compiler is free to add additional writes to a variable, it seems it is not possible to tell from the C code whether a data race exists or not. How to determine this?

<ol> <li>Yes, the speculative assignment is possible. Modification of a non-volatile variable is not part of the observable behaviour of the program and thus a spurious write is allowed. (See below for the definition of "observable behaviour", which does not actually include all behaviour which you might observe.)</li> <li>No. If <code>output</code> is <code>volatile</code>, speculative or spurious mutations are not permitted because the mutation is part of observable behaviour. (Writing to -- or reading from -- a hardware register may have consequences other than just storing a value. This is one of the primary use cases of <code>volatile</code>.)</li> <li>(Edited) No, the speculative assignment is not possible with <code>atomic</code> <code>output</code>. Loads and stores of <code>atomic</code> variables are synchronized operations, so it should not be possible to load a value of such a variable which was not explicitly stored into the variable.</li> </ol> <h3>Observable behaviour</h3> Although a program can do lots of obviously observable things (for example, abruptly terminating because of a segfault), the C and C++ standards only guarantee a limited set of results. Observable behaviour is defined in the C11 draft at §5.1.2.3p6 and in the current C++14 draft at §1.9p8 [intro.execution] with very similar wording: <blockquote> The least requirements on a conforming implementation are: — Access to volatile objects are evaluated strictly according to the rules of the abstract machine. — At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced. — The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined. These collectively are referred to as the observable behavior of the program. </blockquote> The above is taken from the C++ standard; the C standard differs in that in the second point it does not allow multiple possible results, and in the third point it explicitly references a relevant section of the standard library requirements. But details aside, the definitions are co-ordinated; for the purpose of this question, the relevant point is that only access to volatile variables is observable (up to the point that the value of a non-volatile variable is sent to an output device or file). <h3>Data Races</h3> This paragraph needs also to be read in the overall context of the C and C++ standards, which free the implementation from all requirements if the program engenders undefined behaviour. That's why the segfault is not considered in the definition of observable behaviour above: the segfault is a possible undefined behaviour but not a possible behaviour in a conformant program. So in the universe of only conformant programs and conformant implementations, there are no segfaults. That's important because a program with a data race is not conformant. A data race has undefined behaviour, even if it seems innocuous. And since it is the responsibility of the programmer to avoid undefined behaviour, the implementation may optimize without regard to data races. The exposition of the memory model in the C and C++ standards is dense and technical, and probably not suitable as an introduction to the concepts. (Browsing around the material on Hans Boehm's site will probably prove less difficult.) Extracting quotes from the standard is risky, because the details are important. But here is a small leap into the morass, from the current C++14 standard, §1.10 [intro.multithread]: <blockquote> <ol start="6"> <li>Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.</li> </ol> … <ol start="23"> <li> Two actions are potentially concurrent if — they are performed by different threads, or — they are unsequenced, and at least one is performed by a signal handler. The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior. </li> </ol> </blockquote> The take-away here is that a read and a write of the same variable need to be synchronized; otherwise it is a data race and the result is undefined behaviour. Some programmers might object to the strictness of this prohibition, arguing that some data races are "benign". This is the topic of Hans Boehm's 2011 HotPar paper "How to miscompile programs with "benign" data races" (pdf) (author's summary: "There are no benign data races"), and he explains it all much better than I could. Synchronization here includes the use of <code>atomic</code> types, so it is not a data race to concurrently read and modify an <code>atomic</code> variable. (The result of the read is unpredictable, but it must be either the value before the modification or the value afterwards.) This prevents the compiler from performing "piecemeal" modification of an atomic variable without some explicit synchronization. After some thought and more research, my conclusion is that the compiler cannot perform speculative writes to atomic variables either. Consequently, I modified the answer to question 3, which I had originally answered "no". Other useful references: <ul> <li> Bartosz Milewski: Dealing with Benign Data Races the C++ Way Milewski deals with the precise issue of speculative writes to atomic variables, and concludes: <blockquote> Can’t the compiler still do the same dirty trick, and momentarily store 42 in the <code>owner</code> variable? No, it can’t! Since the variable is declared <code>atomic</code> the compiler can no longer assume that the write can’t be observed by other threads. </blockquote> </li> <li> Herb Sutter on Thread Safety and Synchronization As usual, an accessible and well-written explanation. </li> </ul>

Yes, the compiler is allowed to do that kind of optimization. In general, you can assume that the compiler (and the CPU too) can reorder your code assuming it is running in a single thread. If you have more than one thread, you need to synchronize. If you don't synchronize and your code writes to a memory location that is written to or read by another thread, your code contains a data race, in C++ this is undefined behavior. <code>volatile</code> doesn't change the data race problem. However IIRC, the compiler is not allowed to reorder reads and writes to a <code>volatile</code> variable. When using <code>atomic_int</code>, the compiler can still perform certain optimizations. I don't think that the compiler can invent writes though (that could break a multithreaded program). However, it can still reorder operations, so be careful.

Compiler optimizations and temporary assignments in C and C++

Tags:

c++

c

embedded

concurrency

Please see the following code valid in C and C++:

extern int output;
extern int input;
extern int error_flag;

void func(void)
{
  if (0 != error_flag)
  {
    output = -1;
  }
  else
  {
    output = input;
  }
}

Is the compiler allowed to compile the above code in the same way as if it looked like below?
```
extern int output;
extern int input;
extern int error_flag;

void func(void)
{
  output = -1;
  if (0 == error_flag)
  {
    output = input;
  }
}
```
In other words, is the compiler allowed to generate (from the first snippet) code that always makes a temporary assignment of -1 to output and then assign input value to output depending on error_flag status?
Would the compiler be allowed to do it if output would be declared as volatile?
Would the compiler be allowed to do it if output would be declared as atomic_int (stdatomic.h)?

Update after David Schwartz's comment:

If the compiler is free to add additional writes to a variable, it seems it is not possible to tell from the C code whether a data race exists or not. How to determine this?

707

asked Jan 18 '16 20:01

mrn

2 Answers

Yes, the speculative assignment is possible. Modification of a non-volatile variable is not part of the observable behaviour of the program and thus a spurious write is allowed. (See below for the definition of "observable behaviour", which does not actually include all behaviour which you might observe.)
No. If output is volatile, speculative or spurious mutations are not permitted because the mutation is part of observable behaviour. (Writing to -- or reading from -- a hardware register may have consequences other than just storing a value. This is one of the primary use cases of volatile.)
(Edited) No, the speculative assignment is not possible with atomic output. Loads and stores of atomic variables are synchronized operations, so it should not be possible to load a value of such a variable which was not explicitly stored into the variable.

Observable behaviour

Although a program can do lots of obviously observable things (for example, abruptly terminating because of a segfault), the C and C++ standards only guarantee a limited set of results. Observable behaviour is defined in the C11 draft at §5.1.2.3p6 and in the current C++14 draft at §1.9p8 [intro.execution] with very similar wording:

The least requirements on a conforming implementation are:

— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

— The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

These collectively are referred to as the observable behavior of the program.

The above is taken from the C++ standard; the C standard differs in that in the second point it does not allow multiple possible results, and in the third point it explicitly references a relevant section of the standard library requirements. But details aside, the definitions are co-ordinated; for the purpose of this question, the relevant point is that only access to volatile variables is observable (up to the point that the value of a non-volatile variable is sent to an output device or file).

Data Races

This paragraph needs also to be read in the overall context of the C and C++ standards, which free the implementation from all requirements if the program engenders undefined behaviour. That's why the segfault is not considered in the definition of observable behaviour above: the segfault is a possible undefined behaviour but not a possible behaviour in a conformant program. So in the universe of only conformant programs and conformant implementations, there are no segfaults.

That's important because a program with a data race is not conformant. A data race has undefined behaviour, even if it seems innocuous. And since it is the responsibility of the programmer to avoid undefined behaviour, the implementation may optimize without regard to data races.

The exposition of the memory model in the C and C++ standards is dense and technical, and probably not suitable as an introduction to the concepts. (Browsing around the material on Hans Boehm's site will probably prove less difficult.) Extracting quotes from the standard is risky, because the details are important. But here is a small leap into the morass, from the current C++14 standard, §1.10 [intro.multithread]:

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

…

Two actions are potentially concurrent if

— they are performed by different threads, or

— they are unsequenced, and at least one is performed by a signal handler.

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.

The take-away here is that a read and a write of the same variable need to be synchronized; otherwise it is a data race and the result is undefined behaviour. Some programmers might object to the strictness of this prohibition, arguing that some data races are "benign". This is the topic of Hans Boehm's 2011 HotPar paper "How to miscompile programs with "benign" data races" (pdf) (author's summary: "There are no benign data races"), and he explains it all much better than I could.

Synchronization here includes the use of atomic types, so it is not a data race to concurrently read and modify an atomic variable. (The result of the read is unpredictable, but it must be either the value before the modification or the value afterwards.) This prevents the compiler from performing "piecemeal" modification of an atomic variable without some explicit synchronization.

After some thought and more research, my conclusion is that the compiler cannot perform speculative writes to atomic variables either. Consequently, I modified the answer to question 3, which I had originally answered "no".

Other useful references:

Bartosz Milewski: Dealing with Benign Data Races the C++ Way

Milewski deals with the precise issue of speculative writes to atomic variables, and concludes:

Can’t the compiler still do the same dirty trick, and momentarily store 42 in the owner variable? No, it can’t! Since the variable is declared atomic the compiler can no longer assume that the write can’t be observed by other threads.
Herb Sutter on Thread Safety and Synchronization

As usual, an accessible and well-written explanation.

answered Oct 29 '22 02:10

rici

Yes, the compiler is allowed to do that kind of optimization. In general, you can assume that the compiler (and the CPU too) can reorder your code assuming it is running in a single thread. If you have more than one thread, you need to synchronize. If you don't synchronize and your code writes to a memory location that is written to or read by another thread, your code contains a data race, in C++ this is undefined behavior.

volatile doesn't change the data race problem. However IIRC, the compiler is not allowed to reorder reads and writes to a volatile variable.

When using atomic_int, the compiler can still perform certain optimizations. I don't think that the compiler can invent writes though (that could break a multithreaded program). However, it can still reorder operations, so be careful.

answered Oct 29 '22 02:10

Florian

Related questions
                            
                                Complex dynamic_cast in c++
                            
                                how to turn on icc/icpc warnings?
                            
                                c++ template parameter type inference
                            
                                "The breakpoint will not currently be hit" - and I cannot add the .pdb file
                            
                                Google C++ Style Guide. Why input-then-output parameter ordering?
                            
                                C++11 vector of smart pointer
                            
                                C++ storing a value in an unordered pair
                            
                                How do I create a file with boost filesystem without opening it
                            
                                Run-time operators in C++
                            
                                How to switch between GCC and Clang in Clion from within CMakeLists.txt using windows/cygwin
                            
                                How to convert CvMat* to cv::Mat in OpenCV3.0
                            
                                Can std::this_thread::sleep_for() have spurious wakeups?
                            
                                What is "allocation context"?
                            
                                compiling caffe on Yosemite
                            
                                C++ non-type template parameter const char*
                            
                                Do unrestricted unions require placement new and a constructor definition?
                            
                                Does multiplying unsigned short cause undefined behaviour?
                            
                                Can you use a subexpression within fold expressions?
                            
                                What is the difference between Callable concept and the std::is_function type traits?
                            
                                SDL2 - Why does SDL_CreateTextureFromSurface() need a renderer*?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With