Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Undefined behaviour of operators in XOR swap algorithm?

void swap(int* a, int* b) {
    if (a != b)
        *a ^= *b ^= *a ^= *b;
}

As the above *a ^= *b ^= *a ^= *b is just a shortcut for *a = *a ^ (*b = *b ^ (*a = *a ^ *b)), could (e.g.) the 2nd *a be evaluated (for the XOR) just before the 3rd *a is modified (by the =)?

Does it matter whether I write it in C99/C11/C++98/C++11?

like image 798
Alexander Aleksandrovič Klimov Avatar asked Feb 28 '15 13:02

Alexander Aleksandrovič Klimov


People also ask

How does XOR swapping work?

In computer programming, the exclusive or swap (sometimes shortened to XOR swap) is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required.

How does XOR work c++?

The ^ (bitwise XOR) in C or C++ takes two numbers as operands and does XOR on every bit of two numbers. The result of XOR is 1 if the two bits are different. The << (left shift) in C or C++ takes two numbers, left shifts the bits of the first operand, the second operand decides the number of places to shift.


Video Answer


1 Answers

The C++11 standard says:

5.17/1: The assignment operator (=) and the compound assignment operators all group right-to-left. (...) the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.

1.9/15: If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

So *a ^= *b is sequenced as follows :

  1. *a and *b are calculated. It's NOT determined in which order
  2. the xor operation is performed
  3. the assignment is done, i.e. the new value is stored in *a
  4. the new value is used as result of the expression (*a ^= *b)

Now *b ^= *a ^= *b, which according to priority rule is *b ^= (*a ^= *b) :

  1. *b and (*a ^= *b) are calculated. It's NOT determined in which order. But as *b is not modified by (*a ^= *b) it doesn't matter.
  2. the xor operation is performed
  3. the assignment is done, i.e. the new value is stored in *b

But now to the unspecified sequencing with *a ^= *b ^= *a ^= *b which is according to priority rules *a ^= (*b ^= (*a ^= *b) ):

  1. *a and (*b ^= (*a ^= *b) ) are calculated. It's NOT determined in which order. But as *a IS modified by (*b ^= (*a ^= *b) ). So the result will depend on which value is calculated first. That's clearly an U.B.

Suppose *a is evaluated first, (i.e. before anything else):
you would get the original value of it, which will be xored with the value of (*b ^= (*a ^= *b) ), that is the original *b xored with original *a xored again with *b. This will result in 0 (which will be stored in *a).

Suppose (*b ^= (*a ^= *b) ) is evaluated first, then its result is the original *a, but the content of *a is changed to the original *a xored with the original *b. Thus this will result in the original *b (which will be stored in *a)

By the way, in both cases, *b contains the original value of *a xored twice with *b meaning that *b will contain the original *a.

CONCLUSION: Here is demonstrated that the final value of *b is uniquely determined by this expression, but that the final value of *a is not uniquely defined (two values possible). So it's clearly an UNSPECIFIED/UNDEFINED RESULT ! It may swap or it might lose *a depending on your compiler.

How to make the swapping for sure ?

I've demonstrated above that first two compound assignments are well specified. So we just have to make sure that the last compound assignment is done after it. This can be guaranteed by a comma operator:

5.18/1: A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is discarded

Hence, the following would work:

void safe_swap(int* a, int* b) {
    if (a != b)
        *b ^= *a ^= *b, *a ^= *b;
}

EDIT: But why an XOR swapping ?

On some embedded device with no more free memory, one might be obliged in extreme conditions to use such advanced trick. But it has drawbacks.

First it is difficult to understand, and as seen above, error prone. Then it might not be as efficient as it seems. Some implementation dependent experiments show less optimal code: 3 MOV and 3 XOR, compared to only 4 MOV for the classical swap using a temporary variable. Some informal benchmarks suggest that it could be slower by 3 to 8% most of the time.

By the way, the classical swap can also be written in one statement:

void modern_swap(int*a, int*b) {
    if (a!=b) 
        tie(*a,*b)=make_pair(*b,*a);
} 
like image 172
Christophe Avatar answered Oct 16 '22 08:10

Christophe