How does the caller of a function know whether Return Value Optimization was used?

Tags:

c++

My understanding of return value optimization is that the compiler secretly passes the address of the object in which the return value will be stored, and makes the changes to that object instead of a local variable.

For example, the code

std::string s = f();

std::string f()
{
    std::string x = "hi";
    return x;
}

becomes SIMILAR to

std::string s;
f(s);

void f(std::string& x)
{
    x = "hi";
}

when RVO is used. This means that the interface of the function has changed, as there is an extra hidden parameter.

Now consider the following case, which I stole from Wikipedia:

std::string f(bool cond)
{
    std::string first("first");
    std::string second("second");
    // the function may return one of two named objects
    // depending on its argument. RVO might not be applied
    return cond ? first : second;
}

Let's assume that a compiler will apply RVO to the first case, but not to this second case. But doesn't the interface of the function change depending on whether RVO was applied? If the body of function f is not visible to the compiler, how does the compiler know whether RVO was applied and whether the caller needs to pass the hidden address parameter?

asked Sep 05 '13 13:09

Neil Kirk

2 Answers

There's no change in the interface. In all cases, the result of the function must appear in the scope of the caller; typically, the compiler uses a hidden pointer. The only difference is that when RVO is used, as in your first case, the compiler will "merge" x and this return value, constructing x at the address given by the pointer; when it is not used, the compiler will generate a call to the copy constructor in the return statement, to copy whatever is being returned into this return value.

I might add that your second example is not very close to what happens. At the call site, you almost always get something like:

<raw memory for string> s;
f( &s );

And the called function will either construct a local variable or temporary directly at the address it was passed, or copy construct some other value at this address. So in your last example, the return statement would be more or less the equivalent of:

if ( cond ) {
    std::string::string( s, first );
} else {
    std::string::string( s, second );
}

(Showing the implicit this pointer passed to the copy constructor.) In the first case, if RVO applies, the special code would be in the constructor of x:

std::string::string( s, "hi" );

and then replacing x with *s everywhere else in the function (and doing nothing at the return).

answered Nov 15 '22 18:11

James Kanze

Let's play with NRVO, RVO and copy elision!

Here is a type:

#include <iostream>
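struct Verbose {
  // A default constructor must be declared explicitly: declaring any other
  // constructor suppresses the implicit one, and the functions below need it.
  Verbose(){ std::cout << "default ctor\n"; }
  Verbose( Verbose const& ){ std::cout << "copy ctor\n"; }
  Verbose( Verbose && ){ std::cout << "move ctor\n"; }
  Verbose& operator=( Verbose const& ){ std::cout << "copy asgn\n"; return *this; }
  Verbose& operator=( Verbose && ){ std::cout << "move asgn\n"; return *this; }
};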

that is pretty verbose.

Here is a function:

Verbose simple() { return {}; }

that is pretty simple, and uses direct construction of its return value. Even if Verbose lacked a copy or move constructor, the above function would still work!

Here is a function that uses RVO:

Verbose simple_RVO() { return Verbose(); }

Here the unnamed Verbose() temporary object is being told to copy itself to the return value. RVO means that the compiler can skip that copy and construct Verbose() directly in the return value, provided there is a copy or move constructor. The copy or move constructor is not called, but rather elided. (Since C++17 this particular elision is guaranteed, and the constructor no longer has to exist.)

Here is a function that uses NRVO:

 Verbose simple_NRVO() {
   Verbose retval;
   return retval;
 }

For NRVO to occur, every path must return the exact same object, and you can't be sneaky about it (if you cast the return value to a reference, then return that reference, that will block NRVO). In this case, what the compiler does is construct the named object retval directly into the return value location. Similar to RVO, a copy or move constructor must exist, but is not called.

Here is a function that fails to use NRVO:

 Verbose simple_no_NRVO(bool b) {
   Verbose retval1;
   Verbose retval2;
   if (b)
     return retval1;
   else
     return retval2;
 }

as there are two possible named objects it could return, it cannot construct both of them in the return value location, so it must do an actual copy. In C++11, the object returned will be implicitly moved instead of copied, as it is a local variable being returned from a function in a simple return statement. So there is at least that.

Finally, there is copy elision at the other end:

Verbose v = simple(); // or simple_RVO, or simple_NRVO, or...

When you call a function, you provide it with its arguments, and you inform it where it should put its return value. The caller is responsible for cleaning up the return value and allocating the memory (on the stack) for it.

This communication is done in some way via the calling convention, often implicitly (i.e., via the stack pointer).

Under many calling conventions, the location where the return value can be stored can end up being used as a local variable.

In general, if you have a variable of the form:

Verbose v = Verbose();

the implied copy can be elided: Verbose() is constructed directly in v, rather than a temporary being created and then copied to v. In the same way, the copy of the return value of simple (or simple_NRVO, or whatever) into v can be elided if the run-time model of the compiler supports it (and it usually does).

Basically, the calling site can tell simple_* to put the return value in a particular spot, and simply treat that spot as the local variable v.

Note that NRVO, RVO and implicit move are all done within the function, and the caller needs to know nothing about them.

Similarly, the eliding at the calling site is all done outside the function, and if the calling convention supports it you do not need any support from the body of the function.

This doesn't have to be true in every calling convention and run-time model, so the C++ standard, at least through C++14, makes these optimizations optional.

answered Nov 15 '22 18:11

Yakk - Adam Nevraumont
