
Which "return" method is better for large data in C++/C++11?

Tags: c++, parameter-passing, c++11, return-value, return-value-optimization

This question was triggered by my confusion about RVO in C++11.

I have two ways to "return" a value: return by value, and return via a reference parameter. Performance aside, I prefer the first one, because returning by value is more natural and makes it easy to tell the inputs from the outputs. But when efficiency matters and the returned data is large, I can't decide, because C++11 has RVO.

Here is my example code. Both versions do the same work:

Return by value:

#include <vector>
using std::vector;

struct SolutionType
{
    vector<double> X;
    vector<double> Y;
    SolutionType(int N) : X(N), Y(N) { }
};

SolutionType firstReturnMethod(const double input1,
                               const double input2)
{
    // Some work is here (N is assumed to be determined in this part)

    SolutionType tmp_solution(N);
    // Since the names are long, I make aliases.
    vector<double> &x = tmp_solution.X;
    vector<double> &y = tmp_solution.Y;

    for (...)
    {
        // some operations on x and y;
        // after that, these two vectors become very large
    }

    return tmp_solution;
}

Return via reference parameter:

void secondReturnMethod(SolutionType& solution,
                        const double input1,
                        const double input2)
{
    // Some work is here

    // Since the names are long, I make aliases.
    vector<double> &x = solution.X;
    vector<double> &y = solution.Y;

    for (...)
    {
        // some operations on x and y;
        // after that, these two vectors become very large
    }
}

Here are my questions:

1. How can I ensure that RVO happens in C++11?
2. If we are sure that RVO happens, which "return" method do you recommend in modern C++ programming, and why?
3. Why do some libraries return via a reference parameter: is it code style, or historical reasons?

UPDATE: Thanks to these answers, I now know that the first method is better in most respects.

Here are some useful related links that helped me understand this problem:

1. How to return large data efficiently in C++11
2. In C++, is it still bad practice to return a vector from a function?
3. Want Speed? Pass by Value.

208

asked May 12 '16 13:05

Regis

2 Answers

First of all, the proper technical term for what you are doing is NRVO. RVO relates to temporaries being returned:

X foo() {
   return make_x();
}

NRVO refers to named objects being returned:

X foo() {
    X x = make_x();
    x.do_stuff();
    return x;
}

Second, (N)RVO is a compiler optimization and, in C++11, is not mandated. (Since C++17, elision is guaranteed when a prvalue is returned, as in the first example, but NRVO remains optional.) However, with a modern compiler you can be fairly sure that (N)RVO will be applied aggressively.

Third, (N)RVO is not a C++11 feature; it existed long before 2011.

Fourth, what C++11 does add is the move constructor. If your class supports move semantics, the returned object is moved, not copied, even when (N)RVO does not happen. Unfortunately, not everything can be moved efficiently.
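To see the move fallback in action, here is a minimal sketch. The type `Tracked` and the function `make_tracked` are hypothetical names invented for this illustration; they simply count copy and move constructions so you can check what your compiler does.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical instrumented type: counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::vector<double> data;

    explicit Tracked(std::size_t n) : data(n) {}
    Tracked(const Tracked& other) : data(other.data) { ++copies; }
    Tracked(Tracked&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a named local object, so it is a candidate for NRVO.
// If the compiler does not elide, the move constructor is selected,
// never the copy constructor. Assumes n > 0.
Tracked make_tracked(std::size_t n) {
    Tracked t(n);
    t.data[0] = 1.0;
    return t;
}
```

After `Tracked r = make_tracked(1000000);`, `Tracked::copies` should still be 0 regardless of whether elision happened. `Tracked::moves` is typically 0 with NRVO, and can be 1 or 2 when elision is disabled (for example with GCC's or Clang's `-fno-elide-constructors`), depending on the language standard in use.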

Fifth, returning via a reference parameter is a terrible antipattern. It ensures that the object is effectively created twice, first as an 'empty' object and then again when it is populated with data, and it precludes you from using types for which the 'empty' state is not a valid invariant.
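As a rough illustration of the last point, consider a type that has no valid 'empty' state. The class `NonEmptySamples` and the function `compute_samples` below are hypothetical and exist only for this sketch. Because the class has no default constructor, a caller could not create a placeholder object to pass as an output parameter without inventing dummy data, whereas returning by value works naturally.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type whose invariant is "never empty":
// there is deliberately no default constructor.
class NonEmptySamples {
public:
    explicit NonEmptySamples(std::vector<double> values)
        : values_(std::move(values)) {
        if (values_.empty()) {
            throw std::invalid_argument("samples must not be empty");
        }
    }
    const std::vector<double>& values() const { return values_; }

private:
    std::vector<double> values_;
};

// Return by value: the result only ever exists in a fully formed state.
NonEmptySamples compute_samples(double input1, double input2) {
    std::vector<double> v;
    v.push_back(input1);
    v.push_back(input2);
    return NonEmptySamples(std::move(v));
}
```

A call such as `NonEmptySamples s = compute_samples(1.5, 2.5);` yields an object that satisfied its invariant from the moment it was constructed. An output-parameter version, `void compute_samples(NonEmptySamples& out, ...)`, would force the caller to build a throwaway `NonEmptySamples` first.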

112

answered Sep 28 '22 03:09

SergeyA

SergeyA's answer is perfect. If you follow that advice, you will almost never go wrong.

There is, however, one kind of 'result' where it is better to pass a reference to the result from the call site.

This is the case where you use a standard container as a result buffer in a loop.

If you take a look at the function std::getline you'll see an example.

std::getline is designed to fill a std::string buffer from the input stream.

Each time getline is called with the same string reference, the string's data is overwritten. Note that over time (assuming random line lengths), there will sometimes need to be an implicit reserve of the string in order to accommodate new long lines. However, shorter lines than the longest so far will not require a reserve, since there will already be enough capacity.

Imagine a version of getline with the following signature:

std::string fictional_getline(std::istream&);

This implies that a new string is returned each time the function is called. Whether or not RVO or NRVO occurs, that string has to be created, and if it is longer than the short string optimisation boundary, this requires a memory allocation. Furthermore, the string's memory is deallocated each time it goes out of scope.

In this case, and others like it, it is much more efficient to pass your result container as a reference.

Examples:

#include <iostream>
#include <string>

void do_processing(const std::string& s)
{
    // ...
}

/// @post: in the case of an error, stream.fail() == true
/// @post: in the case of no error, stream.fail() == false
std::string fictional_getline(std::istream& stream)
{
    std::string result;
    if (not std::getline(stream, result))
    {
        // what to do here?
    }
    return result;
}

// note that buf is re-used, which will require fewer and fewer
// reallocations the more the loop progresses
void fast_process(std::istream& stream)
{
    std::string buf;
    while (std::getline(stream, buf))
    {
        do_processing(buf);
    }
}

// note that buf is re-created and destroyed each time around the loop
void not_so_fast_process(std::istream& stream)
{
    for (;;)
    {
        auto buf = fictional_getline(stream);
        if (!stream) break;
        do_processing(buf);
    }
}
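One way to observe the buffer reuse is to record the string's capacity after each read. The helper `capacities_per_line` is a hypothetical name used only for this sketch, and the behaviour it shows (capacity being retained between calls) is what common standard library implementations do rather than something the example proves in general.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every line of `in` through one reused buffer and records the
// buffer's capacity after each read (hypothetical helper for illustration).
std::vector<std::size_t> capacities_per_line(std::istream& in) {
    std::vector<std::size_t> caps;
    std::string buf;
    while (std::getline(in, buf)) {
        caps.push_back(buf.capacity());
    }
    return caps;
}
```

Feeding it a `std::istringstream` that holds a 200-character line followed by a short line, you would typically see that the capacity recorded for the short line is still at least 200: the earlier allocation is kept, so no new allocation is needed for the shorter line.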

answered Sep 28 '22 03:09

Richard Hodges
