When we run code, we sometimes see absurd results instead of the expected output. In C and C++, undefined behavior means that the standard places no requirements on what the program does: it may fail to compile, it may crash, it may produce incorrect results, or it may fortuitously do exactly what the programmer intended.
According to the C standard, signed integer overflow is undefined behaviour too. Some compilers can trap the overflow when invoked with a trap-handling option, while others simply assume that the overflow will never happen and generate code accordingly.
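One common way to stay clear of signed overflow is to test the operands before adding them. The following is a minimal sketch of that idea; the helper name checked_add and its bool-plus-out-parameter interface are assumptions made for illustration, not something from a standard library.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true when the sum
 * fits in an int. Returns false and leaves *out untouched otherwise.
 * The comparisons themselves can never overflow. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false; /* a + b would overflow */
    }
    *out = a + b;
    return true;
}
```

A caller would check the return value before using the result, for example: if (!checked_add(x, y, &sum)) { /* report the error */ }.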
C has no dedicated "undefined" value. A function that has no meaningful value to return usually signals failure instead. Sometimes -1 means failure, sometimes 0 means failure, and sometimes 0 means success; you have to look up the documentation to know exactly which. For a pointer, the conventional "no value" is the null pointer, NULL.
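As an illustration of two of these conventions, here is a small sketch. Both functions (index_of and find_first) are hypothetical names invented for this example: the first reports failure with -1, the second with NULL.

```c
#include <stddef.h>

/* Hypothetical: index of the first element equal to value, or -1 if absent. */
int index_of(const int *items, size_t count, int value)
{
    for (size_t i = 0; i < count; ++i) {
        if (items[i] == value) {
            return (int)i;
        }
    }
    return -1; /* failure sentinel for an index */
}

/* Hypothetical: pointer to the first element equal to value, or NULL. */
const int *find_first(const int *items, size_t count, int value)
{
    for (size_t i = 0; i < count; ++i) {
        if (items[i] == value) {
            return &items[i];
        }
    }
    return NULL; /* failure sentinel for a pointer */
}
```

The caller cannot tell from the signatures alone which sentinel applies, which is why each function's documentation has to spell it out.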
In C/C++, shifting a value by a negative number of bits, or by a number of bits greater than or equal to the width of the (promoted) value, results in undefined behavior.
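A shift count can be validated before it is used. The sketch below assumes unsigned int has no padding bits, so that sizeof(unsigned int) * CHAR_BIT equals its width; the helper name checked_shl is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: unsigned int has no padding bits, so this is its width. */
#define UINT_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Hypothetical helper: stores value << count in *out and returns true when
 * the shift count is valid. Returns false and leaves *out untouched when the
 * count is negative or not smaller than the width. */
bool checked_shl(unsigned int value, int count, unsigned int *out)
{
    if (count < 0 || (size_t)count >= UINT_BITS) {
        return false;
    }
    *out = value << count;
    return true;
}
```

The sketch uses an unsigned operand on purpose: left-shifting a negative signed value is a separate source of undefined behavior in C.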
For Code 1, because the order of evaluation of the terms in return *a + f(a, b); (and in return f(a, b) + *a;) is not specified by the standard and the function modifies the value that a is pointing at, your code has unspecified behaviour and various answers are possible.
As you can tell from the furor in the comments, the terms 'undefined behaviour', 'unspecified behaviour' and so on have technical meanings in the C standard, and earlier versions of this answer misused 'undefined behaviour' where it should have used 'unspecified'.
The title of the question is "Is this undefined behaviour in C?", and the answer is "No; it is unspecified behaviour, not undefined behaviour".
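If one particular order is wanted, it can be pinned down by splitting the expression into separate statements. The sketch below does not reproduce the question's f; bump is a hypothetical stand-in that modifies *a, and the two wrappers show each possible order made explicit.

```c
/* Hypothetical stand-in for a function that modifies *a: adds 1 to *a
 * and returns b. */
int bump(int *a, int b)
{
    *a += 1;
    return b;
}

/* Reads *a first, then calls bump: the order is fixed by the statements. */
int read_then_call(int *a, int b)
{
    int left = *a;
    int right = bump(a, b);
    return left + right;
}

/* Calls bump first, then reads the updated *a. */
int call_then_read(int *a, int b)
{
    int right = bump(a, b);
    int left = *a;
    return left + right;
}
```

Starting from *a == 0 and b == 10, the first wrapper yields 10 and the second yields 11, which are exactly the two answers a compiler may pick between for the single-expression form *a + bump(a, b).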
For Code 2 as fixed, the function also has unspecified behaviour: the value of the static variable r is changed by the recursive call, so changes to the evaluation order could change the result.
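The effect of a static variable that a recursive call modifies can be sketched as follows. This is not the question's Code 2; accumulate is a hypothetical function written so that the recursive call is always completed before r is read, which removes the dependence on evaluation order.

```c
/* Hypothetical recursive function with a static accumulator r.
 * The recursive call is given its own statement, so it always finishes
 * (and updates r) before r is read for the final sum. */
int accumulate(int n)
{
    static int r = 0;
    if (n <= 0) {
        return 0;
    }
    r += n;
    int rest = accumulate(n - 1); /* may change r */
    return rest + r;              /* r is read only after the call */
}
```

On a first call, accumulate(2) returns 6 with this ordering. Written as the single expression accumulate(n - 1) + r, a compiler that read r before making the call would produce 5 instead, and both outcomes would be permitted.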
For Code 2, as originally shown with int f(static int n) { … }, the code does not (or, at least, should not) compile. The only storage class permitted in the declaration of a function parameter is register, so the presence of static should be giving you compilation errors.
ISO/IEC 9899:2011 §6.7.6.3 Function declarators (including prototypes) ¶2: The only storage-class specifier that shall occur in a parameter declaration is register.
Compiling with GCC 6.3.0 on macOS Sierra 10.12.2, like this (note, no extra warnings requested):
$ gcc -O ub17.c -o ub17
ub17.c:3:27: error: storage class specified for parameter ‘n’
int foo(static int n)
^
No; it doesn't compile at all as shown — at least, not for me using a modern version of GCC.
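For contrast, a parameter declared with register is accepted, since that is the one storage-class specifier the standard allows there. A minimal sketch, using a hypothetical function name:

```c
/* 'register' is permitted on a parameter; 'static', 'extern' and 'auto'
 * are not. The only practical restriction is that the address of n
 * cannot be taken. */
int countdown_sum(register int n)
{
    int total = 0;
    while (n > 0) {
        total += n;
        --n;
    }
    return total;
}
```

This applies to C; the example is not meant to say anything about C++, where the rules for register differ.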
However, assuming that is fixed, the function also has unspecified behaviour: the value of the static variable r is changed by the recursive call, so changes to the evaluation order could change the result.
The C standard states (6.5.2.2 Function calls, ¶10) that:
There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call. Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced [1] with respect to the execution of the called function.
And footnote 86 (section 6.5/3) says:
In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.
In the expressions return f(a,b) + *a; and return *a + f(a,b); the evaluation of the subexpression *a is indeterminately sequenced with respect to the call to f. In this case different results can be seen for the same program.
Note that the side effect on the object a points to is sequenced in the above expressions, but it is unspecified in which order.
[1] Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which. (C11 5.1.2.3/3)
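Indeterminate sequencing between two function calls can be observed directly. In the sketch below, every name (call_log, record, sum_of_calls) is hypothetical: each call appends a tag to a log, so the log shows which operand of + was evaluated first. The calls never interleave, but either order is allowed.

```c
#include <stddef.h>
#include <string.h>

/* Records the order in which record() was called. */
char call_log[8];

/* Appends tag to call_log (if there is room) and returns value. */
int record(char tag, int value)
{
    size_t used = strlen(call_log);
    if (used + 1 < sizeof call_log) {
        call_log[used] = tag;
        call_log[used + 1] = '\0';
    }
    return value;
}

/* The two calls are indeterminately sequenced: the sum is always 3,
 * but call_log may end up as "LR" or as "RL". */
int sum_of_calls(void)
{
    call_log[0] = '\0';
    return record('L', 1) + record('R', 2);
}
```

Because each call runs to completion before the other starts, the behaviour is defined; only the order, and therefore the log contents, is left to the implementation.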
I will focus on the definition of the first example.
The first example has unspecified behavior. This means that there are multiple possible results, but the behavior is not undefined. (And if the code can handle all of those results, the behavior is defined.)
A trivial example of unspecified behavior is:
int a = 0;
int c = a + a;
It is unspecified whether left a or right a is evaluated first, as they are unsequenced. The + operator doesn't specify any sequence points [1]. There are two possible orderings, either left a is evaluated first and then right a, or vice-versa. Since neither side is modified [2], the behavior is defined.
Had left a or right a been modified without a sequence point, i.e. unsequenced, the behavior would be undefined [2]:
int a = 0;
int c = ++a + a;
Had left a or right a been modified with a sequence point in between, then the left and the right side would be indeterminately sequenced [3]. This means that they are sequenced, but it is unspecified which one is evaluated first. The behavior would be defined. Mind that the comma operator introduces a sequence point [4]:
int a = 0;
int c = a + ((void)0,++a,0);
There are two possible orderings.
If left side is evaluated first, then a evaluates to 0. Then the right side is evaluated. First (void)0 is evaluated followed by a sequence point. Then a is incremented, followed by a sequence point. Then 0 is evaluated as 0 and is added to the left side. The result is 0.
If the right side is evaluated first, (void)0 is evaluated followed by a sequence point. Then a is incremented, followed by a sequence point. Then 0 is evaluated as 0. Then the left side is evaluated, and a evaluates to 1. The result is 1.
Your example falls into the latter category, as the operands are indeterminately sequenced. The function call serves the same purpose [5] as the comma operators in the above example. Your example is complicated, so I will use mine, which also applies to yours. The only difference is that there are many more possible results in your example than in mine, but the reasoning is the same:
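#include <assert.h>

/* Increments the pointed-to int and always returns 0. */
int Function(int *a)
{
    ++(*a);
    return 0;
}
int main(void)
{
    int a = 0;
    int c = a + Function(&a); /* order of the two operands is unspecified */
    assert(c == 0 || c == 1);
    return 0;
}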
There are two possible orderings.
If the left side is evaluated first, a evaluates to 0. Then the right side is evaluated, there is a sequence point and the function is called. Then a is incremented, followed by another sequence point introduced by the end of the full expression [6], the end of which is indicated by the semicolon. Then 0 is returned and added to 0. The result is 0.
If the right side is evaluated first, there is a sequence point and the function is called. Then a is incremented, followed by another sequence point introduced by the end of the full expression. Then 0 is returned. Then the left side is evaluated, and a evaluates to 1 and is added to 0. The result is 1.
(Quoted from: ISO/IEC 9899:201x)
[1] 6.5 Expressions, ¶3: Except as specified later, side effects and value computations of subexpressions are unsequenced.
[2] 6.5 Expressions, ¶2: If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
[3] 5.1.2.3 Program execution: Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which.
[4] 6.5.17 Comma operator, ¶2: The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand.
[5] 6.5.2.2 Function calls, ¶10: There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call.
[6] 6.8 Statements and blocks, ¶4: There is a sequence point between the evaluation of a full expression and the evaluation of the next full expression to be evaluated.