
Is the behavior of i = post_increment_i() specified, unspecified, or undefined?

Consider the following C program:

int i = 0;

int post_increment_i() { return i++; }

int main() {
    i = post_increment_i();
    return i;
}

With respect to the 2011 version of the C standard (known as C11), which of the following alternatives is true?

  1. C11 guarantees that main returns 0.
  2. C11 guarantees that main returns either 0 or 1.
  3. The behavior of this program is undefined according to C11.

Relevant snippets from the C11 standard:

  • 5.1.2.3 Program execution

    Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression in general includes both value computations and initiation of side effects. Value computation for an lvalue expression includes determining the identity of the designated object.

    Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. (Conversely, if A is sequenced before B, then B is sequenced after A.) If A is not sequenced before or after B, then A and B are unsequenced. Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which.13 The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B. (A summary of the sequence points is given in annex C.)

    13) The executions of unsequenced evaluations can interleave. Indeterminately sequenced evaluations cannot interleave, but can be executed in any order.

  • 6.5 Expressions

    An expression is a sequence of operators and operands that specifies computation of a value, or that designates an object or a function, or that generates side effects, or that performs a combination thereof. The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.

    If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

  • 6.5.2.2 Function calls

    There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function.94

    94) In other words, function executions do not ‘‘interleave’’ with each other.

  • 6.5.2.4 Postfix increment and decrement operators

    The result of the postfix ++ operator is the value of the operand. As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it). [...] The value computation of the result is sequenced before the side effect of updating the stored value of the operand. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation.

  • 6.5.16 Assignments

    An assignment operator stores a value in the object designated by the left operand. [...] The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands. The evaluations of the operands are unsequenced.

  • 6.8 Statements and blocks

    A full expression is an expression that is not part of another expression or of a declarator. Each of the following is a full expression: [...] the expression in an expression statement; [...] the (optional) expression in a return statement. There is a sequence point between the evaluation of a full expression and the evaluation of the next full expression to be evaluated.

The three alternatives above correspond to the following three cases, respectively:

  1. The side effect of the postfix increment operator is sequenced before the assignment in main.
  2. The side effect of the postfix increment operator is sequenced either before or after the assignment in main, and C11 does not specify which. (In other words, the two side effects are indeterminately sequenced.)
  3. The two side effects are unsequenced.
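To make the first two cases concrete, here is a minimal sketch that models the two stores to i as separate statements on a local variable. The function names and the local model are illustrative assumptions, not part of the question's program; the sketch only shows what each ordering would produce, not which ordering C11 requires. The third case cannot be modelled this way, since undefined behavior places no constraint on the result.

```c
/* Model: i++ first reads the old value, then two stores to i follow:
   the increment's store (old + 1) and the assignment's store (old).
   Only the order of the two stores differs between the functions. */

/* Case 1: the increment's store happens before the assignment's store. */
static int increment_then_assign(int start) {
    int i = start;
    int old = i;   /* value computation of i++ */
    i = old + 1;   /* side effect of i++ */
    i = old;       /* side effect of the assignment in main */
    return i;
}

/* The other order allowed by case 2: the assignment's store happens first. */
static int assign_then_increment(int start) {
    int i = start;
    int old = i;   /* value computation of i++ */
    i = old;       /* side effect of the assignment in main */
    i = old + 1;   /* side effect of i++ */
    return i;
}
```

With start equal to 0, the first function yields 0 and the second yields 1, which matches the "0" of alternative 1 and the "0 or 1" of alternative 2.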

It seems that the first alternative holds, by the following chain of reasoning:

  • Consider the following rule in 6.5.2.2: "Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function." Assumption A: The side effect of the assignment operator in main is such an "evaluation". Assumption B: The phrase "the execution of the called function" includes both the value computation of the postfix increment operator and the side effect of the postfix increment operator. From these assumptions and the above rule, it follows that either I) the value computation and the side effect of the postfix increment operator are both sequenced before the side effect of the assignment operator in main, or II) the value computation and the side effect of the postfix increment operator are both sequenced after the side effect of the assignment operator in main.

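  • Consider the following rule in 6.5.16: "The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands." The value computation of the right operand is the function call, which cannot yield its result before the value computation of the postfix increment operator in the callee has taken place. The side effect of the assignment therefore cannot precede that value computation, so this rule rules out case II above. It follows that case I holds. QED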
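The ordering behind alternative 1, in which the callee's increment completes before the store from the assignment, can be written out with an explicit temporary. This is only a sketch: the helper name run_with_temporary and the variable tmp are hypothetical, and the rewrite assumes Assumptions A and B hold rather than proving them.

```c
static int i = 0;

/* Same callee as in the question: yields the old value of i, then increments it. */
static int post_increment_i(void) { return i++; }

/* Hypothetical helper: `i = post_increment_i();` spelled out step by step,
   with the increment finishing before the assignment stores into i. */
static int run_with_temporary(void) {
    i = 0;                         /* start from the question's initial state */
    int tmp = post_increment_i();  /* callee runs to completion: tmp is 0, i is 1 */
    i = tmp;                       /* the assignment's store comes last */
    return i;
}
```

Because every step here is a separate full expression, this version has a single well-defined result, 0, regardless of how the original statement is interpreted.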

Overall, this looks like a pretty strong argument. Also, it corresponds to what one would intuitively consider the most likely alternative.

However, it does rely on specific interpretations of the terms "evaluation" and "execution of the called function" (Assumptions A and B) and a not entirely straightforward line of reasoning, so I wanted to put it out there to see if people have reasons to believe that this interpretation is incorrect. Note that footnote 94 is consistent with this interpretation only if it applies also in the sense that the caller does not interleave with the callee, which in turn implies that "interleave" means interleaving in the "abab" sense, since obviously a caller interleaves with the callee in the weaker "aba" sense. Also, alternatives 2 and 3 would seem plausible in a scenario where the compiler inlines the function and then performs the same kinds of optimizations that motivate why the expression i = i++ has undefined behavior.

user1480833 asked Jun 25 '12 20:06
1 Answer

[My answer is based on the simpler C99 standard, and the fact that it's extremely unlikely that C11 would introduce a breaking change:]

The behaviour of this code is well-defined: main returns 0. There is a sequence point immediately after the full expression in the return statement (see C99, Annex C), so the side-effects of i++ take effect before the assignment to i in main.
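As a rough check of this claim, the sketch below observes i at two points: right after the callee has returned, and after the whole statement from the question. The helper names i_after_call and i_after_statement are hypothetical. What a particular compiler produces is not evidence of what the standard guarantees, so this only illustrates the answer's reasoning.

```c
static int i = 0;

/* Same callee as in the question. */
static int post_increment_i(void) { return i++; }

/* Hypothetical helper: the value of i once the callee has returned.
   The sequence point after the return expression means the increment
   is complete by then. */
static int i_after_call(void) {
    i = 0;
    (void)post_increment_i();  /* discard the returned old value */
    return i;
}

/* Hypothetical helper: the value of i after the statement from the question. */
static int i_after_statement(void) {
    i = 0;
    i = post_increment_i();    /* stores the returned old value, 0, over the 1 */
    return i;
}
```

Here i_after_call() yields 1 and i_after_statement() yields 0, consistent with the increment taking effect before the assignment overwrites it.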

Oliver Charlesworth answered Sep 28 '22 08:09