Why is clang not optimizing this with NRVO?

I'm trying to understand why a reasonably good C++11 compiler (clang) is not optimizing this code, and I'm wondering whether anybody here has an opinion.

#include <iostream>
#include <utility>  // std::move
#define SLOW

struct A {
  A() {}
  ~A() { std::cout << "A d'tor\n"; }
  A(const A&) { std::cout << "A copy\n"; }
  A(A&&) { std::cout << "A move\n"; }
  A &operator =(A) { std::cout << "A copy assignment\n"; return *this; }
};

struct B {
  // Using move on a sink. 
  // Nice talk at Going Native 2013 by Sean Parent.
  B(A foo) : a_(std::move(foo)) {}  
  A a_;
};

A MakeA() {
  return A();
}

B MakeB() {
  // The key bits are in here
#ifdef SLOW
  A a(MakeA());
  return B(a);
#else
  return B(MakeA());
#endif
}

int main() {
  std::cout << "Hello World!\n";
  B obj = MakeB();
  std::cout << &obj << "\n";
  return 0;
}

If I run this with #define SLOW commented out and optimized with -s, I get:

Hello World!
A move
A d'tor
0x7fff5fbff9f0
A d'tor

which is expected.

If I run this with #define SLOW enabled and optimized with -s I get:

Hello World!
A copy
A move
A d'tor
A d'tor
0x7fff5fbff9e8
A d'tor

Which obviously isn't as nice. So the question is:

Why am I not seeing NRVO applied in the "SLOW" case? I know that the compiler is not required to apply NRVO, but this seems like such a common, simple case.

In general I try to encourage code of the "SLOW" style because I find it much easier to debug.

dmaclach asked Dec 18 '13 04:12


1 Answer

The simple answer is: because the compiler is not allowed to apply copy elision in this case. Copy elision is permitted only in a few specific situations. The relevant quote from the standard is 12.8 [class.copy] paragraph 31:

... This elision of copy/move operations, called copy elision, is permitted in the following circumstances (which may be combined to eliminate multiple copies):

  • in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value
  • [...]

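In the SLOW case the return expression is B(a), which is not the name of a local object; the local a has type A while MakeB() returns B. So this bullet does not permit eliding the copy: a is an lvalue, and passing it to B's by-value constructor parameter copy-constructs foo (the "A copy" line in the output). The other bullets in the same paragraph cover throw expressions, eliding copies from a temporary, and exception declarations. None of these apply.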

Dietmar Kühl answered Oct 23 '22 01:10