Why does modifying a field that is referenced by another variable lead to unexpected behavior?

I wrote some code that looked very simple at first glance. It modifies a member variable that a reference member is bound to, and then returns the value read through that reference. A simplified version that reproduces the odd behavior looks like this:

#include <iostream>
using std::cout;

struct A {
    int a;
    int& b;

    A(int x) : a(x), b(a) {}
    A(const A& other) : a(other.a), b(a) {}
    A() : a(0), b(a) {}
};

int foo(A a) {
    a.a *= a.b;
    return a.b;
}


int main() {
    A a(3);

    cout << foo(a) << '\n';
    return 0;
}

However, when it is compiled with optimization enabled (g++ 7.5), the output differs from that of the non-optimized build: 9 without optimizations, as expected, and 3 with optimizations enabled.

I am aware of the volatile keyword, which prevents the compiler from reordering or eliding accesses in the presence of certain side effects (e.g. asynchronous execution and hardware-specific behavior), and it helps in this case as well.

However, I do not understand why I need to declare the reference b as volatile in this particular case. Where is the error in this code?

Vadim Andronov asked Jul 11 '20 19:07


1 Answer

I could not find any source of UB with respect to the standard. This looks to me like a bug in the optimizer, which fails to notice that a.b and a.a both refer to the same object. Here is what I checked:

  • First of all, foo() works on a copy. I changed foo() to take its argument by reference, and the expected result was consistently obtained. I therefore suspected an issue in the initialization of the reference in the copy. But the provided copy constructor handles a.b correctly: it binds b to the copy's own a.

  • Then I suspected some UB related to side effects of indeterminately sequenced operations in the same expression. But the side effect on the lhs of *= is sequenced after the evaluation of the rhs, so there is no UB here either.

  • Adding some logging after the *= statement unexpectedly made the code work as expected. This is very strange: it resembles the usual problems encountered when the strict aliasing rule is violated, i.e. when the compiler does not realize that a pointed-to object was modified and optimizes the code as if the value were unchanged. In such cases, it is not unusual for additional code to cause the value to be reloaded and so to produce a different result.

  • There is, however, no aliasing issue here, since the original member and the reference to it both have the same type.

When you have eliminated the impossible, whatever remains, however improbable, must be the truth.
- Sir Arthur Conan Doyle

After having eliminated bugs and UB in the OP's code, the only remaining possibility is a bug in the optimizer. It seems that the optimizer fails to notice that a.a and a.b are the same object, and simply reuses the last known value of a.b, which is already in a register.

Christophe answered Nov 15 '22 03:11