The allegedly "clever" (but actually inefficient) way of swapping two integer variables, instead of using temporary storage, often involves this line:
int a = 10;
int b = 42;
a ^= b ^= a ^= b; /*Here*/
printf("a=%d, b=%d\n", a, b);
But I'm wondering: compound assignment operators like ^= are not sequence points, are they?
Does this mean it's actually undefined behavior?
The C language defines several sequence points. One is the left operand of the logical-AND operator (&&): it is completely evaluated, and all of its side effects complete, before evaluation continues. If the left operand evaluates to false (0), the other operand is not evaluated.
In C, sequence points occur in the following places. Between evaluation of the left and right operands of the && (logical AND), || (logical OR) (as part of short-circuit evaluation), and comma operators.
a ^= b ^= a ^= b; /*Here*/
It is undefined behavior.
You are modifying an object (a) more than once between two sequence points.
(C99, 6.5p2) "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression."
Neither simple assignments nor compound assignments introduce a sequence point. Here the only sequence points are the one before the expression statement and the one at its end.
Sequence points are listed in Annex C (informative) of the C99 and C11 standards.
^= are not sequence points, are they
They are not.
Does this mean it's actually undefined behavior?
Yes it is. Don't use this "clever" technique.
There are no sequence points in that expression, so it produces undefined behavior.
You could fix it trivially and retain most of the succinctness by using the comma operator, which does introduce sequence points:
a ^= b, b ^= a, a ^= b;
The way the ^= operators group is well defined: they associate from right to left. What is not well defined is the order in which a and b are modified.
a ^= b ^= a ^= b;
is equivalent to
a ^= (b ^= (a ^= b));
An operator cannot be evaluated before its operands are evaluated, so the value of the innermost a ^= b is definitely computed first.
This is undefined behavior in order to give the compiler more freedom to optimize: it may store the new values into the variables in any order it chooses. It could choose to do this:
int a1 = a ^ b;
int b1 = b ^ a1;
int a2 = a ^ b1;
a = a1;
a = a2;
b = b1;
or this:
int a1 = a ^ b;
int b1 = b ^ a1;
a = a1;
int a2 = a ^ b1;
a = a2;
b = b1;
or even this:
int a1 = a ^ b;
int b1 = b ^ a1;
int a2 = a ^ b1;
a = a2;
a = a1;
b = b1;
If the compiler were limited to choosing one of those three orderings, this would merely be "unspecified" behavior. However, the standard goes further and makes it "undefined" behavior, which basically allows the compiler to assume that it never happens.