
Which "return" method is better for large data in C++/C++11?

This question was triggered by my confusion about RVO in C++11.

I have two ways to "return" a value: return by value, and return via a reference parameter. Setting performance aside, I prefer the first, since returning by value is more natural and makes it easy to distinguish inputs from outputs. But when efficiency matters and the returned data is large, I can't decide, because C++11 has RVO.

Here is my example code; the two versions do the same work:

return by value

struct SolutionType
{
    vector<double> X;
    vector<double> Y;
    SolutionType(int N) : X(N),Y(N) { }
};

SolutionType firstReturnMethod(const double input1,
                               const double input2)
{
    // Some work is here

    SolutionType tmp_solution(N); 
    // since the name is too long, I make alias.
    vector<double> &x = tmp_solution.X;
    vector<double> &y = tmp_solution.Y;

    for (...)
    {
    // some operation about x and y
    // after that these two vectors become very large
    }

    return tmp_solution;
}

return via reference parameter

void secondReturnMethod(SolutionType& solution,
                        const double input1,
                        const double input2)
{
    // Some work is here        

    // since the name is too long, I make alias.
    vector<double> &x = solution.X;
    vector<double> &y = solution.Y;

    for (...)
    {
    // some operation about x and y
    // after that these two vectors become very large
    }
}

Here are my questions:

  1. How can I ensure that RVO happens in C++11?
  2. If we are sure that RVO happens, which "return" method do you recommend in modern C++ programming? Why?
  3. Why do some libraries return via a reference parameter? Is it code style or a historical reason?

UPDATE: Thanks to these answers, I now know the first method is better in most respects.

Here are some useful related links that helped me understand this problem:

  1. How to return large data efficiently in C++11
  2. In C++, is it still bad practice to return a vector from a function?
  3. Want Speed? Pass by Value.
asked May 12 '16 13:05 by Regis


2 Answers

First of all, the proper technical term for what you are doing is NRVO. RVO relates to temporaries being returned:

X foo() {
   return make_x();
}

NRVO refers to named objects being returned:

X foo() {
    X x = make_x();
    x.do_stuff();
    return x;
}

Second, (N)RVO is a compiler optimization that the C++11 standard permits but does not mandate. However, you can be pretty sure that a modern compiler will apply (N)RVO quite aggressively.
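One way to observe this on a given compiler is a small tracing type. The sketch below is illustrative only: `Traced`, `make_temporary`, `make_named`, and `copies_observed` are hypothetical names that do not appear in the question. Whether any moves are elided depends on the compiler and its flags, but in C++11 and later no copy construction should be counted, because a returned local object is treated as an rvalue first.

```cpp
#include <cassert>

// Hypothetical tracing type: counts how often it is copy- or move-constructed.
struct Traced {
    static int copies;
    static int moves;
    int payload = 0;

    Traced() = default;
    Traced(const Traced& other) : payload(other.payload) { ++copies; }
    Traced(Traced&& other) noexcept : payload(other.payload) { ++moves; }
};
int Traced::copies = 0;
int Traced::moves = 0;

// RVO candidate: the returned expression is a temporary.
Traced make_temporary() {
    return Traced();
}

// NRVO candidate: the returned expression names a local object.
Traced make_named() {
    Traced t;
    t.payload = 42;
    return t;
}

// Resets the counters, calls both functions, and returns the number of
// copy constructions seen. Traced::moves then holds the number of moves
// that the compiler did not elide.
int copies_observed() {
    Traced::copies = 0;
    Traced::moves = 0;
    Traced a = make_temporary();
    Traced b = make_named();
    assert(b.payload == 42);
    (void)a;  // only constructed to exercise the return path
    return Traced::copies;
}
```

After calling `copies_observed()`, inspecting `Traced::moves` shows how many moves were left in place; with optimizations enabled on mainstream compilers this is typically zero, though that is an observation about a particular build rather than a guarantee.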

Third, (N)RVO is not a C++11 feature; it existed long before 2011.

Fourth, what C++11 does add is the move constructor. So if your class supports move semantics, it will be moved from, not copied, even if (N)RVO does not happen. Unfortunately, not everything can be moved efficiently.
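For a type made of standard containers, the move is cheap because the heap buffers change owner instead of being duplicated. The following is a minimal sketch under stated assumptions: `Solution` is a hypothetical stand-in for the question's `SolutionType`, and `move_reuses_buffers` is an illustrative helper that forces a move with `std::move` so the result does not depend on elision.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's SolutionType. It declares no
// copy operations or destructor, so the compiler generates a move
// constructor that moves each vector member.
struct Solution {
    std::vector<double> X;
    std::vector<double> Y;
    explicit Solution(std::size_t n) : X(n), Y(n) {}
};

// Moves a Solution explicitly and reports whether the target ended up
// owning the very same heap buffers that the source allocated.
bool move_reuses_buffers(std::size_t n) {
    assert(n > 0);  // an empty vector has no buffer worth comparing
    Solution source(n);
    const double* x_before = source.X.data();
    const double* y_before = source.Y.data();

    Solution target(std::move(source));  // move construction, no element copy

    return target.X.data() == x_before && target.Y.data() == y_before;
}
```

So even in the worst case where the compiler elides nothing, returning such a type by value costs a few pointer swaps rather than a copy of every element.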

Fifth, returning via a reference parameter is a terrible antipattern. It means the object is effectively created twice (first as an 'empty' object, then again when it is populated with data), and it precludes you from using objects for which an 'empty' state is not a valid invariant.
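The point about invariants can be illustrated with a type that refuses to exist in an empty state. This is a sketch, not code from the question: `NonEmptySamples` and `make_samples` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type whose invariant is "holds at least one value".
// It has no default constructor, so no 'empty' instance can exist.
class NonEmptySamples {
public:
    explicit NonEmptySamples(std::vector<double> values)
        : values_(std::move(values)) {
        if (values_.empty()) {
            throw std::invalid_argument("NonEmptySamples requires at least one value");
        }
    }

    double front() const {
        assert(!values_.empty());  // guaranteed by the constructor
        return values_.front();
    }

    std::size_t size() const { return values_.size(); }

private:
    std::vector<double> values_;
};

// Return by value: the object is constructed once, already populated.
NonEmptySamples make_samples(std::size_t count, double start) {
    std::vector<double> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(start + static_cast<double>(i));
    }
    return NonEmptySamples(std::move(values));
}
```

An out-parameter version such as `void fill_samples(NonEmptySamples& out, ...)` would require the caller to construct a `NonEmptySamples` before any data exists, which this type deliberately makes impossible.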

answered Sep 28 '22 03:09 by SergeyA


SergeyA's answer is perfect. If you follow that advice, you will almost never go wrong.

There is however one kind of 'result' where it is better to pass a reference to the result from the call site.

This is in the case where you are using a std container as a result buffer in a loop.

If you take a look at the function std::getline you'll see an example.

std::getline is designed to fill a std::string buffer from the input stream.

Each time getline is called with the same string reference, the string's data is overwritten. Note that over time (assuming random line lengths), there will sometimes need to be an implicit reserve of the string in order to accommodate new long lines. However, shorter lines than the longest so far will not require a reserve, since there will already be enough capacity.

Imagine a version of getline with the following signature:

std::string fictional_getline(std::istream&);

This implies that a new string is returned each time the function is called. Whether or not RVO or NRVO occurs, that string will need to be created, and if it is longer than the short string optimisation boundary, this will require a memory allocation. Furthermore, the string's memory will be deallocated each time it goes out of scope.

In this case, and others like it, it is much more efficient to pass your result container as a reference.

Examples:

#include <iostream>
#include <string>

void do_processing(const std::string& s)
{
    // ...
}

/// @post: in the case of an error, stream.fail() == true
/// @post: in the case of no error, stream.fail() == false
std::string fictional_getline(std::istream& stream)
{
    std::string result;
    if (not std::getline(stream, result))
    {
        // what to do here?
    }
    return result;
}

// note that buf is re-used which will require fewer and fewer 
// reallocations the more the loop progresses
void fast_process(std::istream& stream)
{
    std::string buf;
    while(std::getline(stream, buf))
    {
        do_processing(buf);
    }
}

// note that buf is re-created and destroyed each time around the loop    
void not_so_fast_process(std::istream& stream)
{
    for(;;)
    {
        auto buf = fictional_getline(stream);
        if (!stream) break;
        do_processing(buf);
    }
}
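
The buffer reuse in fast_process can be made visible by watching the string's capacity between reads. The sketch below is an illustration under stated assumptions: `ReuseStats`, `read_with_reused_buffer`, and `sample_run` are hypothetical names, and the exact number of capacity changes depends on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result record for the experiment.
struct ReuseStats {
    std::size_t lines;             // lines successfully read
    std::size_t capacity_changes;  // reads after which capacity differed
};

// Reads every line into one reused buffer and counts how often the
// buffer's capacity changed, a rough proxy for reallocations.
ReuseStats read_with_reused_buffer(std::istream& stream) {
    ReuseStats stats{0, 0};
    std::string buf;
    std::size_t last_capacity = buf.capacity();
    while (std::getline(stream, buf)) {
        ++stats.lines;
        if (buf.capacity() != last_capacity) {
            ++stats.capacity_changes;
            last_capacity = buf.capacity();
        }
    }
    return stats;
}

// Builds an in-memory input of equally long lines and runs the reader.
ReuseStats sample_run(std::size_t line_count, std::size_t line_length) {
    assert(line_count > 0);
    std::string text;
    for (std::size_t i = 0; i < line_count; ++i) {
        text.append(line_length, 'x');
        text.push_back('\n');
    }
    std::istringstream input(text);
    return read_with_reused_buffer(input);
}
```

With lines longer than the short string optimisation boundary, the reused buffer typically grows while the first line is read and then stays put, whereas a function returning a fresh `std::string` would have to allocate for every such line.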
answered Sep 28 '22 03:09 by Richard Hodges