Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is this a known pitfall of C++11 for loops?

Let's imagine we have a struct for holding 3 doubles with some member functions:

struct Vector {   double x, y, z;   // ...   Vector &negate() {     x = -x; y = -y; z = -z;     return *this;   }   Vector &normalize() {      double s = 1./sqrt(x*x+y*y+z*z);      x *= s; y *= s; z *= s;      return *this;   }   // ... }; 

This is a little contrived for simplicity, but I'm sure you agree that similar code is out there. The methods allow you to conveniently chain, for example:

Vector v = ...; v.normalize().negate(); 

Or even:

Vector v = Vector{1., 2., 3.}.normalize().negate(); 

Now if we provided begin() and end() functions, we could use our Vector in a new-style for loop, say to loop over the 3 coordinates x, y, and z (you can no doubt construct more "useful" examples by replacing Vector with e.g. String):

Vector v = ...; for (double x : v) { ... } 

We can even do:

Vector v = ...; for (double x : v.normalize().negate()) { ... } 

and also:

for (double x : Vector{1., 2., 3.}) { ... } 

However, the following (it seems to me) is broken:

for (double x : Vector{1., 2., 3.}.normalize()) { ... } 

While it seems like a logical combination of the previous two usages, I think this last usage creates a dangling reference while the previous two are completely fine.

  • Is this correct and Widely appreciated?
  • Which part of the above is the "bad" part, that should be avoided?
  • Would the language be improved by changing the definition of the range-based for loop such that temporaries constructed in the for-expression exist for the duration of the loop?
like image 424
ndkrempel Avatar asked May 15 '12 03:05

ndkrempel


1 Answers

Is this correct and Widely appreciated?

Yes, your understanding of things is correct.

Which part of the above is the "bad" part, that should be avoided?

The bad part is taking an l-value reference to a temporary returned from a function, and binding it to an r-value reference. It is just as bad as this:

auto &&t = Vector{1., 2., 3.}.normalize(); 

The temporary Vector{1., 2., 3.}'s lifetime cannot be extended because the compiler has no idea that the return value from normalize references it.

Would the language be improved by changing the definition of the range-based for loop such that temporaries constructed in the for-expression exist for the duration of the loop?

That would be highly inconsistent with how C++ works.

Would it prevent certain gotchas made by people using chained expressions on temporaries or various lazy-evaluation methods for expressions? Yes. But it would also be require special-case compiler code, as well as be confusing as to why it doesn't work with other expression constructs.

A much more reasonable solution would be some way to inform the compiler that the return value of a function is always a reference to this, and therefore if the return value is bound to a temporary-extending construct, then it would extend the correct temporary. That's a language-level solution though.

Presently (if the compiler supports it), you can make it so that normalize cannot be called on a temporary:

struct Vector {   double x, y, z;   // ...   Vector &normalize() & {      double s = 1./sqrt(x*x+y*y+z*z);      x *= s; y *= s; z *= s;      return *this;   }   Vector &normalize() && = delete; }; 

This will cause Vector{1., 2., 3.}.normalize() to give a compile error, while v.normalize() will work fine. Obviously you won't be able to do correct things like this:

Vector t = Vector{1., 2., 3.}.normalize(); 

But you also won't be able to do incorrect things.

Alternatively, as suggested in the comments, you can make the rvalue reference version return a value rather than a reference:

struct Vector {   double x, y, z;   // ...   Vector &normalize() & {      double s = 1./sqrt(x*x+y*y+z*z);      x *= s; y *= s; z *= s;      return *this;   }   Vector normalize() && {      Vector ret = *this;      ret.normalize();      return ret;   } }; 

If Vector was a type with actual resources to move, you could use Vector ret = std::move(*this); instead. Named return value optimization makes this reasonably optimal in terms of performance.

like image 85
Nicol Bolas Avatar answered Sep 19 '22 18:09

Nicol Bolas