
Why does the get helper of std::tuple return an rvalue reference instead of a value?

If you look at get, the helper function for std::tuple, you will notice the following overload:

template< std::size_t I, class... Types >
constexpr std::tuple_element_t<I, tuple<Types...> >&&
get( tuple<Types...>&& t );

In other words, it returns an rvalue reference when the input tuple is itself an rvalue. Why not return by value, calling move in the function body? My argument is as follows: the result of get will either be bound to a reference or used to initialize a value (it could also be discarded, I suppose, but that shouldn't be a common use case). If it initializes a value, a move construction occurs either way, so you lose nothing by returning by value. If it is bound to a reference, then returning an rvalue reference can actually be unsafe. To show an example:

#include <iostream>
#include <utility>

struct Hello {
  Hello() {
    std::cerr << "Constructed at : " << this << std::endl;
  }

  ~Hello() {
    std::cerr << "Destructed at : " << this << std::endl;
  }

  double m_double;
};

struct foo {
  Hello m_hello;
  Hello && get() && { return std::move(m_hello); }
};

int main() {
  const Hello & x = foo().get();
  std::cerr << x.m_double;
}

When run, this program prints:

Constructed at : 0x7ffc0e12cdc0
Destructed at : 0x7ffc0e12cdc0
0

In other words, x is immediately a dangling reference. Whereas if you just wrote foo like this:

struct foo {
  Hello m_hello;
  Hello get() && { return std::move(m_hello); }
};

This problem would not occur. Furthermore, if you then use foo like this:

Hello x(foo().get());

There doesn't seem to be any extra overhead whether you return by value or by rvalue reference. I've tested code like this, and it consistently performs only a single move construction. E.g. if I add a member:

  Hello(Hello && ) { std::cerr << "Moved" << std::endl; }

And I construct x as above, my program only prints "Moved" once regardless of whether I return by value or rvalue reference.
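The experiment described above can be reproduced with a counter instead of printed output. The following is only a minimal sketch, assuming C++17 (or a compiler that performs copy elision, which is what makes the by-value case a single move); Counted, ByRef, ByValue and the helper functions are hypothetical names that do not appear in the original code:

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts move constructions, forbids copies.
struct Counted {
    static int moves;
    Counted() = default;
    Counted(const Counted&) = delete;
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::moves = 0;

// Variant 1: hand out the member as an rvalue reference.
struct ByRef {
    Counted m;
    Counted&& get() && { return std::move(m); }
};

// Variant 2: hand out the member by value.
struct ByValue {
    Counted m;
    Counted get() && { return std::move(m); }
};

// Number of move constructions needed to initialize x from ByRef().get().
int moves_by_ref() {
    Counted::moves = 0;
    Counted x(ByRef().get());    // the only move happens here, into x
    (void)x;
    return Counted::moves;
}

// Number of move constructions needed to initialize x from ByValue().get().
int moves_by_value() {
    Counted::moves = 0;
    Counted x(ByValue().get());  // the move happens inside get(); the result is x itself
    (void)x;
    return Counted::moves;
}

// Both variants need exactly one move construction.
void check_single_move() {
    assert(moves_by_ref() == 1);
    assert(moves_by_value() == 1);
}
```

Calling check_single_move() from main is enough to run the comparison. Because the copy constructor is deleted, the sketch would also fail to compile if either variant fell back to a copy.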

Is there a good reason I'm missing, or is this an oversight?

Note: there is a good related question here: Return value or rvalue reference?. It seems to say that value return is generally preferable in this situation, but the fact that it shows up in the STL makes me curious whether the STL has ignored this reasoning, or if they have special reasons of their own that may not be as applicable generally.

Edit: Someone has suggested this question is a duplicate of Is there any case where a return of a RValue Reference (&&) is useful?. This is not the case; the answer there suggests returning by rvalue reference as a way to avoid copying data members. As I discuss in detail above, the copy is avoided whether you return by value or by rvalue reference, provided you call move first.

Nir Friedman asked Sep 28 '22 08:09


1 Answer

Your example of how this can be used to create a dangling reference is very interesting, but it's important to learn the correct lesson from the example.

Consider a much simpler example, that doesn't have any && anywhere:

const int &x = vector<int>(1).front();

.front() returns an lvalue reference to the first element of the newly constructed vector. The vector is of course destroyed at the end of the statement, and you are left with a dangling reference.

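The lesson to be learned is that binding to a const reference does not, in general, extend a lifetime. The lifetime is extended only when the reference binds directly to a temporary object, i.e. when the initializer is a non-reference. If the right-hand side of = is itself a reference, such as the result of a function that returns a reference, then you have to take responsibility for lifetimes yourself.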

This has always been the case, so it wouldn't make sense for std::get to do anything different. std::get on a tuple is permitted to return a reference, just as vector::front always has been.
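To make the rule concrete, here is a small sketch that records whether the destructor has already run by the time the next statement executes. Tracer, Holder and the function names are hypothetical, [[maybe_unused]] assumes C++17, and the dangling reference is deliberately never read:

```cpp
#include <cassert>

// Hypothetical tracer: sets a flag when it is destroyed.
struct Tracer {
    bool* destroyed;
    explicit Tracer(bool* flag) : destroyed(flag) {}
    ~Tracer() { *destroyed = true; }
};

// Hypothetical owner that hands out a reference to its member,
// in the same way vector::front() does.
struct Holder {
    Tracer t;
    explicit Holder(bool* flag) : t(flag) {}
    const Tracer& get() const { return t; }
};

// The reference binds directly to a temporary, so its lifetime is extended.
bool temporary_destroyed_early() {
    bool destroyed = false;
    [[maybe_unused]] const Tracer& r = Tracer(&destroyed);
    return destroyed;  // still false: the Tracer lives as long as r
}

// The reference binds to a reference returned by a function: no extension.
bool returned_reference_target_destroyed_early() {
    bool destroyed = false;
    [[maybe_unused]] const Tracer& r = Holder(&destroyed).get();
    return destroyed;  // already true: the Holder died with the statement above
}

void check_lifetimes() {
    assert(!temporary_destroyed_early());
    assert(returned_reference_target_destroyed_early());
}
```

Some compilers may warn about the dangling reference in the second function, which is exactly the situation being illustrated.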

You talk about move and copy constructors and about speed. The fastest solution is to use no constructors whatsoever. Imagine a function to concatenate two vectors:

vector<int> concat(const vector<int> &l_, const vector<int> &r) {
    vector<int> l(l_);
    l.insert(l.end(), r.cbegin(), r.cend());
    return l;
}

This would allow an optimized extra overload:

vector<int>&& concat(vector<int>&& l, const vector<int> &r) {
    l.insert(l.end(), r.cbegin(), r.cend());
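    // l is a named parameter and therefore an lvalue; it must be moved to bind to the && return type
    return std::move(l);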
}

This optimization keeps the number of constructions to a minimum:

   vector<int> a{1,2,3};
   vector<int> b{3,4,5};
   vector<int> c = concat(
     concat(
       concat(
         concat(vector<int>(), a)
       , b)
     , a)
   , b);

The final line, with four calls to concat, will only have two constructions: The starting value (vector<int>()) and the move-construct into c. You could have 100 nested calls to concat there, without any extra constructions.

So, returning by && can be faster. Because, yes, moves are faster than copies, but it's even faster still if you can avoid both.
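The construction count claimed for the nested concat calls can be checked with an instrumented type. This is only a sketch: Buf is a hypothetical stand-in for vector<int> that counts every constructor call, and count_nested_constructions is likewise made up for the illustration.

```cpp
#include <cassert>
#include <initializer_list>
#include <utility>
#include <vector>

// Hypothetical stand-in for vector<int> that counts its constructions.
struct Buf {
    static int constructions;
    std::vector<int> data;
    Buf() { ++constructions; }
    Buf(std::initializer_list<int> init) : data(init) { ++constructions; }
    Buf(const Buf& o) : data(o.data) { ++constructions; }
    Buf(Buf&& o) noexcept : data(std::move(o.data)) { ++constructions; }
};
int Buf::constructions = 0;

// General overload: copies the left operand, then appends.
Buf concat(const Buf& l_, const Buf& r) {
    Buf l(l_);
    l.data.insert(l.data.end(), r.data.cbegin(), r.data.cend());
    return l;
}

// Rvalue overload: appends in place and hands the same object back.
Buf&& concat(Buf&& l, const Buf& r) {
    l.data.insert(l.data.end(), r.data.cbegin(), r.data.cend());
    return std::move(l);
}

// Counts the Buf constructions performed by the nested expression only.
int count_nested_constructions() {
    Buf a{1, 2, 3};
    Buf b{3, 4, 5};
    Buf::constructions = 0;  // ignore the constructions of a and b
    Buf c = concat(concat(concat(concat(Buf(), a), b), a), b);
    assert(c.data.size() == 12);  // four appends of three elements each
    return Buf::constructions;    // Buf() plus the move into c
}
```

Every nested call selects the rvalue overload because the previous call returns Buf&&, so the temporary created by Buf() is reused all the way up and is only moved from once, when c is initialized.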

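In summary, it's done for speed. Consider a nested series of get calls on a tuple-within-a-tuple-within-a-tuple: returning by value would move-construct an object at every level, whereas returning by && constructs nothing. It also allows get to work with types that have neither copy nor move constructors.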
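Both points can be illustrated with a nested tuple holding a type that cannot be copied or moved at all. This is a sketch under stated assumptions: Pinned, Inner, Outer and read_nested are hypothetical names, and only std::get, std::tuple and std::declval come from the standard library.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value = 7;
    Pinned() = default;
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

using Inner = std::tuple<Pinned>;
using Outer = std::tuple<Inner>;

// get on an rvalue tuple yields an rvalue reference to the element,
// and this composes through nested calls.
static_assert(
    std::is_same<decltype(std::get<0>(std::declval<Inner>())), Pinned&&>::value,
    "get on an rvalue tuple returns an rvalue reference");
static_assert(
    std::is_same<decltype(std::get<0>(std::get<0>(std::declval<Outer>()))),
                 Pinned&&>::value,
    "nested get still returns an rvalue reference");

// Reaches the innermost element without constructing any Pinned object.
int read_nested() {
    Outer t;  // default-constructs the single Pinned in place
    Pinned&& p = std::get<0>(std::get<0>(std::move(t)));
    return p.value;  // t is still alive here, so p is valid
}
```

If get returned by value, the nested expression could not compile for Pinned, since each level would need a move constructor.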

And this doesn't introduce any new risks regarding lifetime. The vector<int>().front() "problem" is not a new one.

Aaron McDaid answered Nov 15 '22 09:11