
Is it undefined behavior to use functions with side effects in an unspecified order?

I know that things like x = x++ + ++x invoke undefined behavior because a variable is modified more than once without an intervening sequence point. That's thoroughly explained in this post: Why are these constructs using pre and post-increment undefined behavior?

But consider something like printf("foo") + printf("bar"). The function printf returns an int, so the expression is valid in that sense. But the standard does not specify the order in which the operands of the + operator are evaluated, so it is not clear whether this will print foobar or barfoo.

My question is whether this is also undefined behavior.

klutt asked Aug 30 '20 09:08





1 Answer

printf("foo") + printf("bar") does not have undefined behavior (except for the caveat noted below) because the function calls are indeterminately sequenced and are not unsequenced.

C effectively has three possibilities for sequencing:

  • Two things, A and B, may be sequenced in a particular order: either A before B or B before A.
  • Two things may be indeterminately sequenced, so that A is sequenced before B or vice-versa, but it is unspecified which.
  • Two things are unsequenced.
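
As a rough illustration of the three cases, here is a minimal sketch. The helper names (mark, reset_log, comma_is_ordered, plus_is_either_order) are made up for this example and do not come from the standard. The unsequenced case appears only as a comment, since executing it would be undefined.

```c
#include <string.h>

/* Hypothetical helpers for illustration: a tiny log of which call ran when. */
static char order_log[3];
static int log_len;

static int mark(char c)
{
    order_log[log_len++] = c;   /* record this call */
    order_log[log_len] = '\0';
    return 1;
}

static void reset_log(void)
{
    log_len = 0;
    order_log[0] = '\0';
}

/* Sequenced: the comma operator evaluates its left operand first,
 * so the log is always "ab". */
static int comma_is_ordered(void)
{
    reset_log();
    (void)(mark('a'), mark('b'));
    return strcmp(order_log, "ab") == 0;
}

/* Indeterminately sequenced: either call may run first, but the calls
 * never overlap, so the log is "ab" or "ba". */
static int plus_is_either_order(void)
{
    reset_log();
    (void)(mark('a') + mark('b'));
    return strcmp(order_log, "ab") == 0 || strcmp(order_log, "ba") == 0;
}

/* Unsequenced, shown only as a comment because executing it is undefined:
 *     int i = 0;
 *     i = i++ + i++;
 */
```

Both checking functions should return 1 on a conforming implementation, but only the first one guarantees a particular order.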

To distinguish between the latter two, suppose writing to stdout requires putting bytes in a buffer and updating the counter of how many bytes are in the buffer. (For this, we will neglect what happens when the buffer is full or should be sent to the output device.) Consider two writes to stdout, called A and B.

If A and B are indeterminately sequenced, then either one can go first, but both of its parts—writing the bytes and updating the counter—must be completed before the other one starts. If A and B are unsequenced, then nothing controls the parts; we might have: A puts its bytes in the buffer, B puts its bytes in the buffer, A updates the counter, B updates the counter.

In the former case, both writes are completed, but they can be completed in either order. In the latter case, the behavior is undefined. One of the possibilities is that B writes its bytes in the same place in the buffer as A’s bytes, losing A's bytes, because the counter was not updated to tell B where its new bytes should go.
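
The buffer thought experiment can be modeled in ordinary, well-defined C. The sketch below is only a toy: struct toy_buffer and the function names are assumptions for illustration and say nothing about how any real stdio is implemented. The interleaving is written out by hand to show one possible outcome; real undefined behavior is not limited to this outcome.

```c
#include <assert.h>
#include <string.h>

/* Toy output buffer: an assumption for illustration only. */
struct toy_buffer {
    char bytes[16];
    size_t count;
};

/* Part 1 of a write: copy the bytes to a given position. */
static void put_bytes(struct toy_buffer *buf, const char *s, size_t start)
{
    assert(start + strlen(s) < sizeof buf->bytes);
    memcpy(buf->bytes + start, s, strlen(s));
}

/* Part 2 of a write: advance the counter. */
static void bump_count(struct toy_buffer *buf, size_t n)
{
    buf->count += n;
}

/* Models indeterminate sequencing: each write completes both parts
 * before the other write starts. */
static void whole_writes(struct toy_buffer *buf, const char *first, const char *second)
{
    put_bytes(buf, first, buf->count);
    bump_count(buf, strlen(first));
    put_bytes(buf, second, buf->count);
    bump_count(buf, strlen(second));
}

/* Models one interleaving that unsequenced writes would allow:
 * both copies happen before either counter update. */
static void interleaved_writes(struct toy_buffer *buf, const char *a, const char *b)
{
    size_t stale = buf->count;  /* both writers see the same old counter */
    put_bytes(buf, a, stale);
    put_bytes(buf, b, stale);   /* lands on top of a's bytes */
    bump_count(buf, strlen(a));
    bump_count(buf, strlen(b));
}
```

Starting from a zeroed buffer, whole_writes(&buf, "foo", "bar") leaves "foobar" with a count of 6, while interleaved_writes(&buf, "foo", "bar") leaves only "bar" at the start of the buffer even though the counter still says 6.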

In printf("foo") + printf("bar"), the writes to stdout are indeterminately sequenced. This is because the function calls provide sequence points that separate the side effects, but we do not know in which order they are evaluated.

C 2018 6.5.2.2 10 tells us that function calls introduce sequence points:

There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.

Thus, if the C implementation happens to evaluate printf("foo") second, there is a sequence point just before the actual call, and the evaluation of printf("bar") must have been sequenced before this. Conversely, if the implementation evaluates printf("bar") second, then printf("foo") must have been sequenced before it. So, there is sequencing, albeit indeterminate.
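
The same guarantee can be observed with user-defined functions whose bodies each perform two side effects. This is a sketch with made-up names (trace, step, first, second, bodies_do_not_interleave). It checks that the two bodies run back to back in one order or the other, never interleaved.

```c
#include <string.h>

/* Hypothetical trace buffer; all names here are assumptions for illustration. */
static char trace[16];
static size_t trace_len;

static void step(char who, char n)
{
    trace[trace_len++] = who;
    trace[trace_len++] = n;
    trace[trace_len] = '\0';
}

/* Each function body performs two separate side effects. */
static int first(void)  { step('A', '1'); step('A', '2'); return 1; }
static int second(void) { step('B', '1'); step('B', '2'); return 2; }

/* Evaluates first() + second() and reports whether the bodies ran
 * one after the other (in either order) rather than interleaved. */
static int bodies_do_not_interleave(void)
{
    trace_len = 0;
    trace[0] = '\0';
    int sum = first() + second();
    return sum == 3 &&
           (strcmp(trace, "A1A2B1B2") == 0 || strcmp(trace, "B1B2A1A2") == 0);
}
```

A trace such as "A1B1A2B2" would require the bodies to overlap, which 6.5.2.2 10 rules out.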

Additionally, 7.1.4 3 tells us:

There is a sequence point immediately before a library function returns.

Therefore, the two function calls are indeterminately sequenced. The rule in 6.5 2 about unsequenced side effects does not apply:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined…

(Not to mention the fact that stdout is not a scalar object.)

Caveat

There is a hazard: the C standard permits standard library functions to be additionally implemented as function-like macros (C 2018 7.1.4 1). In that case, the reasoning above about sequence points might not apply. A program can force function calls by enclosing the name in parentheses so that it will not be treated as an invocation of a function-like macro: (printf)("foo") + (printf)("bar").
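
A minimal sketch of the parenthesized form, wrapped in a function whose name (print_both_forced) is made up for this example:

```c
#include <stdio.h>

/* Prints "foobar" or "barfoo" and returns the sum of the two printf
 * return values, which is 6 when both writes succeed. */
static int print_both_forced(void)
{
    /* A parenthesized name is not followed directly by '(', so it cannot
     * be expanded as a function-like macro; these are real function calls. */
    return (printf)("foo") + (printf)("bar");
}
```

The order of the two words in the output remains unspecified; only the macro hazard is removed.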

Eric Postpischil answered Sep 29 '22 15:09
