
Compiler deduction of rvalue-references for variables going out of scope

Why won't the compiler automatically deduce that a variable is about to go out of scope, and therefore let it be considered an rvalue-reference?

Take for example this code:

#include <string>

int foo(std::string && bob);
int foo(const std::string & bob);

int main()
{
    std::string bob("  ");
    return foo(bob);
}

Inspecting the assembly code clearly shows that the const & version of "foo" is called at the end of the function.

Compiler Explorer link here: https://godbolt.org/g/mVi9y6

Edit: To clarify, I'm not looking for suggestions for alternative ways to move the variable. Nor am I trying to understand why the compiler chooses the const& version of foo. Those are things that I understand fine.

I'm interested in knowing of a counter example where the compiler converting the last usage of a variable before it goes out of scope into an rvalue-reference would introduce a serious bug into the resulting code. I'm unable to think of code that breaks if a compiler implements this "optimization".

If there's no code that breaks when the compiler automatically makes the last usage of a variable about to go out of scope an rvalue-reference, then why wouldn't compilers implement that as an optimization?

My assumption is that there is some code that would break were compilers to implement that "optimization", and I'd like to know what that code looks like.

The code that I detail above is an example of code that I believe would benefit from an optimization like this.

The order of evaluation of function arguments, as in operator+(foo(bob), foo(bob)), is unspecified. As such, code such as

return foo(bob) + foo(std::move(bob));

is dangerous, because the compiler that you're using may evaluate the right hand side of the + operator first. That could move from the string bob, leaving it in a valid but unspecified state. Subsequently, foo(bob) would be called with the resulting, modified string.

On another implementation, the non-move version might be evaluated first, and the code would behave the way a non-expert would expect.
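The two orderings can be emulated deterministically by sequencing the calls by hand. This is only a sketch: the definitions of foo below are hypothetical (the question declares the overloads but never defines them), and the rvalue overload clears its argument explicitly so that the moved-from state is definite for the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical overloads; the question never defines foo.
std::size_t foo(const std::string& s) { return s.size(); }
std::size_t foo(std::string&& s) {
    std::string taken(std::move(s));  // steal the contents
    s.clear();                        // make the moved-from state definite for this demo
    return taken.size();
}

// Emulates an implementation that evaluates the left operand first.
std::size_t left_operand_first() {
    std::string bob("abc");
    std::size_t lhs = foo(bob);             // sees "abc"
    std::size_t rhs = foo(std::move(bob));  // consumes "abc"
    return lhs + rhs;
}

// Emulates an implementation that evaluates the right operand first.
std::size_t right_operand_first() {
    std::string bob("abc");
    std::size_t rhs = foo(std::move(bob));  // consumes "abc"
    std::size_t lhs = foo(bob);             // sees the emptied string
    return lhs + rhs;
}
```

With these assumed definitions the two functions return different sums, which is exactly the difference that foo(bob) + foo(std::move(bob)) leaves up to the implementation.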

If we assume that some future version of the C++ standard allows the compiler to treat the last usage of a variable as an rvalue reference, then

return foo(bob) + foo(bob);

would work with no surprises (assuming appropriate implementations of foo, anyway).

Such a compiler, no matter what order of evaluation it uses for function arguments, would always evaluate the second (and thus last) usage of bob in this context as an rvalue-reference, whether that was the left hand side, or right hand side of the operator+.

jonesmz asked Aug 18 '17 19:08


2 Answers

Here's a piece of perfectly valid existing code that would be broken by your change:

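#include <future>
#include <memory_resource>
#include <vector>

struct Foo {};  // placeholder result type; the original snippet leaves Foo undefined

// launch a thread that does the calculation, moving v to the thread, and
// returns a future for the result
std::future<Foo> run_some_async_calculation_on_vector(std::pmr::vector<int> v);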

std::future<Foo> run_some_async_calculation() {
    char buffer[2000];
    std::pmr::monotonic_buffer_resource rsrc(buffer, 2000);
    std::pmr::vector<int> vec(&rsrc);
    // fill vec
    return run_some_async_calculation_on_vector(vec);
}

Move constructing a container always propagates its allocator, but copy constructing one doesn't have to: the copy obtains its allocator from select_on_container_copy_construction. For polymorphic_allocator, that function returns a default-constructed allocator, so the copy always uses the default memory resource instead of the source's.

This code is safe with copying because run_some_async_calculation_on_vector receives a copy allocated from the default memory resource (which hopefully persists throughout the thread's lifetime). It is completely broken by a move, because the moved vector would keep rsrc as its memory resource, and rsrc (along with buffer) is destroyed once run_some_async_calculation returns.
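The difference in allocator propagation can be checked directly. The following is a minimal sketch, assuming a C++17 standard library with <memory_resource>; the two helper function names are made up for illustration.

```cpp
#include <cassert>
#include <memory_resource>
#include <utility>
#include <vector>

// Reports whether a copy of a pmr vector ends up on the default resource.
bool copy_uses_default_resource() {
    char buffer[256];
    std::pmr::monotonic_buffer_resource rsrc(buffer, sizeof buffer);
    std::pmr::vector<int> vec(&rsrc);
    vec.push_back(1);

    std::pmr::vector<int> copy(vec);  // copy construction
    return copy.get_allocator().resource() == std::pmr::get_default_resource();
}

// Reports whether a moved-to pmr vector keeps the source's memory resource.
bool move_keeps_source_resource() {
    char buffer[256];
    std::pmr::monotonic_buffer_resource rsrc(buffer, sizeof buffer);
    std::pmr::vector<int> vec(&rsrc);
    vec.push_back(1);

    std::pmr::vector<int> moved(std::move(vec));  // move construction
    return moved.get_allocator().resource() == &rsrc;
}
```

Both helpers return true: the copy is detached from the stack buffer, while the moved-to vector still depends on it. If the moved-to vector outlived rsrc, its storage would dangle.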

T.C. answered Nov 15 '22 05:11


The answer to your question is that the standard doesn't allow it. The compiler can only do that optimization under the as-if rule, that is, when it can prove that the program's observable behavior doesn't change. std::string's copy and move constructors are complex enough that the compiler isn't going to do the verification it would need to.

To build on this point a bit: all that it takes to write code that "breaks" under this optimization is to have the two different versions of foo print different things. That's it. The compiler produces a program that prints something different than the standard says that it should. That's a compiler bug. Note that RVO does not fall under this category because it is specifically addressed by the standard.
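As a minimal sketch of that point, assume hypothetical definitions of the two foo overloads that return distinct tags instead of printing. The function names last_use_as_written and last_use_as_rvalue are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical definitions: the question only declares these overloads.
// Each returns a distinct tag so the choice of overload is observable.
int foo(std::string&&) { return 1; }       // chosen for rvalues
int foo(const std::string&) { return 2; }  // chosen for lvalues

// What the standard requires today: bob is an lvalue, even at its last use.
int last_use_as_written() {
    std::string bob("  ");
    return foo(bob);
}

// What the proposed "optimization" would silently turn the code above into.
int last_use_as_rvalue() {
    std::string bob("  ");
    return foo(std::move(bob));
}
```

The two functions return different values, so a compiler that rewrote the first into the second would change the program's observable behavior.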

It might make more sense to ask why the standard doesn't say so, e.g., why not extend the rule under which a local variable named in a return statement is implicitly treated as an rvalue. The answer is most likely that it rapidly becomes complicated to define correct behavior. What do you do if the last line were return foo(bob) + foo(bob)? And so on.
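To show where the existing return rule stops, here is a small sketch. Tracker, which, returns_local and passes_local are all hypothetical names introduced for illustration. A by-value parameter is used because copy elision is not permitted for it, which keeps the result deterministic.

```cpp
#include <cassert>

// Hypothetical type that records which constructor produced it.
struct Tracker {
    bool copied = false;
    bool moved = false;
    Tracker() = default;
    Tracker(const Tracker&) : copied(true) {}
    Tracker(Tracker&&) noexcept : moved(true) {}
};

// Hypothetical overloads returning a tag for the overload that was chosen.
int which(Tracker&&) { return 1; }
int which(const Tracker&) { return 2; }

// t is the operand of the return statement, so overload resolution first
// treats it as an rvalue and the move constructor builds the result.
Tracker returns_local(Tracker t) { return t; }

// t is only an argument inside the returned expression, so the rule does
// not apply: it stays an lvalue and binds to the const& overload.
int passes_local(Tracker t) { return which(t); }
```

The implicit move applies only when the variable itself is the returned expression. The question's return foo(bob) falls into the second case.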

Nir Friedman answered Nov 15 '22 05:11