
r-value Reference Casting and Temporary Materialization

The code below produces this output:

void doit(const T1 &, const T2 &) [T1 = unsigned long, T2 = int]
t1 == t2
t1 == (T1)t2
t1 != (T1&)t2
t1 == (T1&&)t2

I understand that the t1 == t2 case is simply an integral promotion.

The second case t1 == (T1)t2 is the same thing, just explicit.

The third case t1 == (T1&)t2 must be a reinterpret_cast of some sort... Though, further explanation would be helpful.

The fourth case t1 == (T1&&)t2 is what I am stuck on. I put in the term 'Temporary Materialization' in the question's title as this is the closest I could come to some sort of answer.

Could someone go over these four cases?

Code:

#include <iostream>    

template <typename T1, typename T2>
void doit(const T1& t1, const T2& t2) {
  std::cout << __PRETTY_FUNCTION__ << '\n';

  if (t1 == t2) {
    std::cout << "t1 == t2" << '\n';
  }
  else {
    std::cout << "t1 != t2" << '\n';
  }    

  if (t1 == (T1)t2) {
    std::cout << "t1 == (T1)t2" << '\n';
  }
  else {
    std::cout << "t1 != (T1)t2" << '\n';
  }    

  if (t1 == (T1&)t2) {
    std::cout << "t1 == (T1&)t2" << '\n';
  }
  else {
    std::cout << "t1 != (T1&)t2" << '\n';
  }    

  if (t1 == (T1&&)t2) {
    std::cout << "t1 == (T1&&)t2" << '\n';
  }
  else {
    std::cout << "t1 != (T1&&)t2" << '\n';
  }
}    

int main() {
  const unsigned long a = 1;
  const int b = 1;    

  doit(a, b);    

  return 0;
}
Supervisor asked Feb 14 '18 21:02



2 Answers

Let's look at (T1&&)t2 first. This is indeed a temporary materialization; what happens is that the compiler performs lvalue-to-rvalue conversion on t2 (i.e. accesses its value), casts that value to T1, constructs a temporary of type T1 (with value 1, since that is the value of b and is a valid value of type T1), and binds it to an rvalue reference. Then, in the comparison t1 == (T1&&)t2, both sides are again subject to lvalue-to-rvalue conversion, and since this is valid (both refer to an object of type T1 within its lifetime, the left hand side to a and the right hand side to the temporary) and both sides have value 1, they compare equal.
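As a rough illustration of the steps just described, the cast can be spelled as a static_cast with concrete types. This is only a sketch: the helper name equal_via_rvalue_ref_cast is made up here, and T1 and T2 are fixed to unsigned long and int as in the question.

```cpp
// Hypothetical helper (not from the question): T1 = unsigned long, T2 = int.
inline bool equal_via_rvalue_ref_cast(const unsigned long& t1, const int& t2) {
    // The value of t2 is read and converted to unsigned long; a temporary
    // unsigned long holding that value is materialized and bound to the
    // rvalue reference. The temporary lives until the end of the full
    // expression, so the comparison reads a live object of the right type.
    return t1 == static_cast<unsigned long&&>(t2);
}
```

Under these assumptions, equal_via_rvalue_ref_cast(1UL, 1) is true, and the result does not depend on whatever happens to sit next to b in memory.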

Note that a materialized temporary of type T1 (say) can bind either to a reference T1&& or T1 const&, so you could try the latter as well in your program.

Also note that while the T1 converted from t2 is a temporary, its lifetime would be extended if you bound it to a local reference variable (e.g. T1&& r2 = (T1&&)t2;). That would extend the lifetime of the temporary to that of the local reference variable, i.e. to the end of the scope. This is important when considering the "within its lifetime" rule, but here the temporary is destroyed at the end of the full expression, which is still after it is accessed by the == comparison.
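The two reference bindings and the lifetime extension can be sketched as follows. To keep the example simple, it initializes the references directly from t2 rather than through the cast; the helper name sum_of_bound_temporaries is hypothetical, and T1 and T2 are again assumed to be unsigned long and int.

```cpp
// Hypothetical helper (not from the question): T1 = unsigned long, T2 = int.
inline unsigned long sum_of_bound_temporaries(const int& t2) {
    // int and unsigned long are not reference-related, so each binding
    // creates a temporary unsigned long initialized from the value of t2.
    unsigned long&& r1 = t2;       // lifetime extended to the end of this scope
    const unsigned long& r2 = t2;  // a second, separate temporary
    r1 += 10;                      // modifies the first temporary only, not t2
    return r1 + r2;                // both temporaries are still alive here
}
```

With t2 equal to 1, r1 ends up as 11 and r2 stays 1, which suggests that each reference refers to its own temporary rather than to the original int.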

Next, (T1&)t2 should be interpreted as a static_cast reference binding to a temporary T1 followed by a const_cast; that is, const_cast<T1&>(static_cast<T1 const&>(t2)). The first (inner) cast materializes a temporary T1 with value converted from t2 and binds it to a T1 const& reference, and the second (outer) cast casts away const. Then, the == comparison performs lvalue-to-rvalue conversion on t1 and on the T1& reference; both of these are valid since both refer to an object of type T1 within its lifetime, and since both have value 1 they should compare equal. (Interestingly, the materialized temporary is also a candidate for lifetime extension, but that doesn't matter here.)
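Writing the two casts out by hand sidesteps the question of which interpretation the compiler picks for the C-style cast. A minimal sketch, assuming T1 = unsigned long and T2 = int (the helper name is made up for illustration):

```cpp
// Hypothetical helper (not from the question): T1 = unsigned long, T2 = int.
inline bool equal_via_static_then_const_cast(const unsigned long& t1,
                                             const int& t2) {
    // Inner cast: materializes a temporary unsigned long from the value of
    // t2 and binds a const reference to it.
    // Outer cast: removes const from that reference. The temporary is only
    // read, and it lives until the end of the full expression.
    return t1 == const_cast<unsigned long&>(
                     static_cast<const unsigned long&>(t2));
}
```

Spelled this way, the comparison reads a genuine unsigned long object, so equal_via_static_then_const_cast(1UL, 1) yields true.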

However, all major compilers currently fail to apply this interpretation (see the questions "Why is (int&)0 ill-formed?" and "Why does this C-style cast not consider static_cast followed by const_cast?") and instead perform a reinterpret_cast (actually a reinterpret_cast to T1 const& followed by a const_cast to T1&). This has undefined behavior (since unsigned long and int are distinct types that are not related by signedness and are not types that can access raw memory). On platforms where the two types have different sizes (e.g. 64-bit Linux), it will read stack garbage after b and thus usually print that they are unequal. On Windows, where unsigned long and int are the same size, it will print that they are equal for the wrong reason; the behavior is still undefined.

ecatmur answered Oct 21 '22 14:10



The compiler attempts to interpret C-style casts as C++-style casts, in the following order (see cppreference for full details):

  1. const_cast
  2. static_cast
  3. static_cast followed by const_cast
  4. reinterpret_cast
  5. reinterpret_cast followed by const_cast

Interpretation of (T1)t2 is pretty straightforward. const_cast fails, but static_cast works, so it's interpreted as static_cast<T1>(t2) (#2 above).
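A small sketch of the value-cast case, assuming T1 = unsigned long and T2 = int. Both helper names are invented for this example; the second one shows that converting a negative int is well defined and wraps modulo 2^N.

```cpp
#include <limits>

// Hypothetical helper: the C-style value cast and the explicit static_cast
// perform the same integral conversion.
inline bool value_cast_matches_static_cast(int t2) {
    return (unsigned long)t2 == static_cast<unsigned long>(t2);
}

// Hypothetical helper: converting -1 to an unsigned type yields the largest
// value of that type.
inline bool minus_one_wraps_to_max() {
    return static_cast<unsigned long>(-1) ==
           std::numeric_limits<unsigned long>::max();
}
```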

For (T1&)t2, it's impossible to convert an int& to unsigned long& via static_cast. Both const_cast and static_cast fail, so reinterpret_cast is ultimately used, giving reinterpret_cast<T1&>(t2). To be precise, #5 above, since t2 is const: const_cast<T1&>(reinterpret_cast<const T1&>(t2)).

EDIT: The static_cast for (T1&)t2 fails due to a key line in cppreference: "If the cast can be interpreted in more than one way as static_cast followed by a const_cast, it cannot be compiled." Implicit conversions are involved, and all of the following are valid (I assume the following overloads exist, at a minimum):

  • T1 c1 = t2; const_cast<T1&>(static_cast<const T1&>(c1))
  • const T1& c1 = t2; const_cast<T1&>(static_cast<const T1&>(c1))
  • T1&& c1 = t2; const_cast<T1&>(static_cast<const T1&>(std::move(c1)))

Note that the actual expression, t1 == (T1&)t2, leads to undefined behavior, as Swift pointed out (assuming sizeof(int) != sizeof(unsigned long)). An address that holds an int is being treated (reinterpreted) as holding an unsigned long. Swap the order of definition of a and b in main(), and the result will change to be equal (on x86 systems with gcc). This is the only case that has undefined behavior, due to a bad reinterpret_cast. Other cases are well defined, with results that are platform specific.

For (T1&&)t2, the conversion is from an int (lvalue) to an unsigned long (xvalue). An xvalue is essentially an lvalue that is "movable"; it is not a reference. The conversion is static_cast<T1&&>(t2) (#2 above). The conversion is equivalent to std::move((T1)t2), or std::move(static_cast<T1>(t2)). When writing code, use std::move(static_cast<T1>(t2)) instead of static_cast<T1&&>(t2), as the intent is much clearer.
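The two spellings can be compared at compile time. The sketch below assumes T1 = unsigned long and T2 = int and requires C++11 or later; the helper name equal_via_move is hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Both expressions are xvalues of type unsigned long, so decltype reports
// unsigned long&& for each of them.
static_assert(
    std::is_same<decltype(static_cast<unsigned long&&>(
                     std::declval<const int&>())),
                 unsigned long&&>::value,
    "static_cast<T1&&>(t2) is an xvalue of type T1");
static_assert(
    std::is_same<decltype(std::move(static_cast<unsigned long>(
                     std::declval<const int&>()))),
                 unsigned long&&>::value,
    "std::move(static_cast<T1>(t2)) is an xvalue of type T1");

// Hypothetical helper: the comparison written in the recommended form.
inline bool equal_via_move(const unsigned long& t1, const int& t2) {
    return t1 == std::move(static_cast<unsigned long>(t2));
}
```

The static_asserts only check the type and value category of each expression; the helper shows the comparison from the question written with the more explicit spelling.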

This example shows why C++-style casts should be used instead of C-style casts. Code intent is clear with C++-style casts, as the correct cast is explicitly specified by the developer. With C-style casts, the actual cast is selected by the compiler, and may lead to surprising results.

Jay West answered Oct 21 '22 13:10
