Consider the following code:
    #include <iostream>
    #include <type_traits>
    #include <utility>  // std::forward

    struct A {
        A() {}
        A(const A&) { std::cout << "Copy" << std::endl; }
        A(A&&) { std::cout << "Move" << std::endl; }
    };

    template <class T>
    struct B {
        T x;
    };

    #define MAKE_B(x) B<decltype(x)>{ x }

    template <class T>
    B<T> make_b(T&& x) {
        return B<T> { std::forward<T>(x) };
    }

    int main() {
        std::cout << "Macro make b" << std::endl;
        auto b1 = MAKE_B( A() );
        std::cout << "Non-macro make b" << std::endl;
        auto b2 = make_b( A() );
    }
This outputs the following:
Macro make b
Non-macro make b
Move
Note that b1 is constructed without a move, but the construction of b2 requires a move.
I also need type deduction, as A in real-life usage may be a complex type which is difficult to write explicitly. I also need to be able to nest calls (i.e. make_c(make_b(A()))).
Is such a function possible?
Further thoughts:
N3290 Final C++0x draft page 284:
This elision of copy/move operations, called copy elision, is permitted in the following circumstances:
when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move
Unfortunately, this seems to mean that we can't elide copies (and moves) of function parameters to function results (including constructors), as those temporaries are either bound to a reference (when passed by reference) or no longer temporaries (when passed by value). It seems the only way to elide all copies when creating a composite object is to create it as an aggregate. However, aggregates have certain restrictions, such as requiring that all members be public and that there be no user-defined constructors.
I don't think it makes sense for C++ to allow optimizations for aggregate construction of POD C structs but not allow the same optimizations for construction of non-POD C++ classes.
Is there any way to allow copy/move elision for non-aggregate construction?
My answer:
This construct allows copies to be elided for non-POD types. I got this idea from David Rodríguez's answer below. It requires C++11 lambdas. In the example below I've changed make_b to take two arguments to make things less trivial. There are no calls to any move or copy constructors.
    #include <iostream>
    #include <type_traits>

    struct A {
        A() {}
        A(const A&) { std::cout << "Copy" << std::endl; }
        A(A&&) { std::cout << "Move" << std::endl; }
    };

    template <class T>
    class B {
    public:
        template <class LAMBDA1, class LAMBDA2>
        B(const LAMBDA1& f1, const LAMBDA2& f2) : x1(f1()), x2(f2()) {
            std::cout << "I'm a non-trivial, therefore not a POD.\n"
                      << "I also have private data members, so definitely not a POD!\n";
        }

    private:
        T x1;
        T x2;
    };

    #define DELAY(x) [&]{ return x; }
    #define MAKE_B(x1, x2) make_b(DELAY(x1), DELAY(x2))

    template <class LAMBDA1, class LAMBDA2>
    auto make_b(const LAMBDA1& f1, const LAMBDA2& f2) -> B<decltype(f1())> {
        return B<decltype(f1())>( f1, f2 );
    }

    int main() {
        auto b1 = MAKE_B( A(), A() );
    }
If anyone knows how to achieve this more neatly I'd be quite interested to see it.
Previous discussion:
This somewhat follows on from the answers to the following questions:
Can creation of composite objects from temporaries be optimised away?
Avoiding need for #define with expression templates
Eliminating unnecessary copies when building composite objects
Guaranteed copy elision redefines a number of C++ concepts, such that certain circumstances where copies/moves could be elided don't actually provoke a copy/move at all. The compiler isn't eliding a copy; the standard says that no such copying could ever happen.
GCC provides the -fno-elide-constructors option to disable copy-elision. This option is useful to observe (or not observe) the effects of return value optimization or other optimizations where copies are elided. It is generally not recommended to disable this important optimization.
Copy elision is an optimization implemented by most compilers to prevent extra (potentially expensive) copies in certain situations. It makes returning by value or pass-by-value feasible in practice (restrictions apply).
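To make the effect concrete, here is a minimal sketch that counts constructor calls when an object is returned by value. The type `Tracker` and the functions `make_tracker` and `copies_and_moves_on_return` are hypothetical names invented for this illustration. It assumes default compiler flags; before C++17, building with GCC's -fno-elide-constructors would be expected to change the count.

```cpp
// Hypothetical helper type: counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Returns a temporary; with elision it is built directly in the caller's variable.
Tracker make_tracker() { return Tracker(); }

// Runs the experiment and returns the total number of copy/move constructor calls.
int copies_and_moves_on_return() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker t = make_tracker();
    (void)t;  // silence unused-variable warnings
    return Tracker::copies + Tracker::moves;
}
```

Calling `copies_and_moves_on_return()` from `main` should report zero when elision is applied, which is what makes returning such objects by value cheap in practice.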
As Anthony has already mentioned, the standard forbids copy elision from the argument of a function to the return of the same function. The rationale that drives that decision is that copy elision (and move elision) is an optimization by which two objects in the program are merged into the same memory location, that is, the copy is elided by having both objects be one. The (partial) standard quote is below, followed by a set of circumstances under which copy elision is allowed, which do not include that particular case.
So what makes that particular case different? The difference is basically that there is a function call between the original and the copied objects, and the function call implies that there are extra constraints to consider, in particular the calling convention.
Given a function T foo( T ), and a user calling T x = foo( T(param) );, in the general case, with separate compilation, the compiler will create an object $tmp1 in the location that the calling convention requires the first argument to be. It will then call the function and initialize x from the return statement. Here is the first opportunity for copy elision: by carefully placing x on the location where the returned temporary is, x and the returned object from foo become a single object, and that copy is elided. So far so good. The problem is that the calling convention in general will not have the returned object and the parameter in the same location, and because of that, $tmp1 and x cannot be a single location in memory.
Without seeing the function definition, the compiler cannot possibly know that the only purpose of the argument to the function is to serve as the return value, and as such it cannot elide that extra copy. It can be argued that if the function is inline then the compiler would have the missing extra information to understand that the temporary used to call the function, the returned value and x are a single object. The problem is that that particular copy can only be elided if the code is actually inlined (not merely marked as inline, but actually inlined). If a function call is required, then the copy cannot be elided. If the standard allowed that copy to be elided when the code is inlined, it would imply that the behavior of a program would differ due to the compiler and not user code: the inline keyword does not force inlining, it only means that multiple definitions of the same function do not represent a violation of the ODR.
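The T foo( T ) case described above can be observed with a small counting type. This is only a sketch: `Tracker`, `pass_through` and `moves_through_function` are hypothetical names, and default compiler flags are assumed. The argument and the final variable are each constructed in place, but the step from parameter to return value still runs a constructor (a move, since C++11 treats a returned by-value parameter as an rvalue).

```cpp
// Hypothetical helper type: counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// The parameter lives where the calling convention puts it, so the return
// value has to be constructed from it; this step is not eligible for elision.
Tracker pass_through(Tracker t) { return t; }

// Returns the number of move constructor calls for: Tracker x = pass_through(Tracker());
int moves_through_function() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker x = pass_through(Tracker());
    (void)x;  // silence unused-variable warnings
    return Tracker::moves;
}
```

With the temporary-to-parameter and return-to-x steps elided, the one remaining move is exactly the parameter-to-return step discussed above.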
Note that if the variable were created inside the function (rather than passed into it), as in:

    T foo() { T tmp; ...; return tmp; }
    T x = foo();

then both copies can be elided. There is no restriction on where tmp has to be created (it is not an input or output parameter of the function, so the compiler is able to relocate it anywhere, including the location of the returned object), and on the calling side x can, as in the previous example, be carefully placed in the location of that same returned object, which basically means that tmp, the returned object and x can be a single object.
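A sketch of the local-variable case, using hypothetical names (`Tracker`, `make_local`, `constructions_for_local`). Eliding the copy of a named local is permitted rather than required, so the count is left to the compiler: GCC and Clang typically report zero here, while a compiler that skips this optimization would be expected to fall back to a single move under default flags.

```cpp
// Hypothetical helper type: counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// tmp is a named local, not a parameter, so the compiler is free to create it
// directly in the storage reserved for the return value.
Tracker make_local() {
    Tracker tmp;
    return tmp;
}

// Returns the total number of copy/move constructor calls for: Tracker x = make_local();
int constructions_for_local() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker x = make_local();
    (void)x;  // silence unused-variable warnings
    return Tracker::copies + Tracker::moves;
}
```

Comparing this result with the by-value parameter case shows the difference the answer describes: only the local variable can share storage with both the return value and x.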
As for your particular problem, if you resort to a macro, the code is inlined, there are no restrictions on the objects and the copy can be elided. But if you add a function, you cannot elide the copy from the argument to the return statement. So just avoid it. Instead of using a template that will move the object, create a template that will construct an object:
    template <typename T, typename... Args>
    T create( Args... x ) {
        return T( x... );
    }
And that copy can be elided by the compiler.
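As a variation on the create template, the sketch below forwards its arguments instead of taking them by value, and counts how often the result type is copied or moved. `Widget` and `widget_copies_and_moves` are hypothetical names for illustration, and the use of std::forward is an assumption of this sketch rather than part of the answer. Note that the arguments are still used to construct the members, so this avoids copying the result, not the arguments.

```cpp
#include <string>
#include <utility>

// Hypothetical result type that counts its own copies and moves.
struct Widget {
    static int copies;
    static int moves;
    int id;
    std::string name;
    Widget(int i, std::string n) : id(i), name(std::move(n)) {}
    Widget(const Widget& o) : id(o.id), name(o.name) { ++copies; }
    Widget(Widget&& o) : id(o.id), name(std::move(o.name)) { ++moves; }
};
int Widget::copies = 0;
int Widget::moves = 0;

// Constructs T in the return statement, so the result can be built in place.
template <typename T, typename... Args>
T create(Args&&... args) {
    return T(std::forward<Args>(args)...);
}

// Returns the number of Widget copy/move constructor calls for one create call.
int widget_copies_and_moves() {
    Widget::copies = 0;
    Widget::moves = 0;
    Widget w = create<Widget>(42, "gear");
    (void)w;  // silence unused-variable warnings
    return Widget::copies + Widget::moves;
}
```

Under default compiler flags, `widget_copies_and_moves()` should return zero, since the only Widget ever constructed is the one that ends up in w.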
Note that I have not dealt with move construction, as you seem concerned about the cost of even a move construction, although I believe that you are barking up the wrong tree. Given a motivating real use case, I am quite sure that people here will come up with a couple of efficient ideas.
12.8/31
When certain criteria are met, an implementation is allowed to omit the copy/move construction of a class object, even if the copy/move constructor and/or destructor for the object have side effects. In such cases, the implementation treats the source and target of the omitted copy/move operation as simply two different ways of referring to the same object, and the destruction of that object occurs at the later of the times when the two objects would have been destroyed without the optimization.
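The phrase "two different ways of referring to the same object" can be checked by comparing addresses. In this sketch, `Tracker`, `make_tracker` and `source_and_target_are_same_object` are hypothetical names, and default compiler flags are assumed: the constructor records where the object was built, and the caller compares that address with the address of its own variable.

```cpp
// Hypothetical helper type: remembers the address at which it was default-constructed.
struct Tracker {
    static const Tracker* constructed_at;
    Tracker() { constructed_at = this; }
    Tracker(const Tracker&) {}
    Tracker(Tracker&&) {}
};
const Tracker* Tracker::constructed_at = nullptr;

// Returns a temporary; with elision it is the caller's object.
Tracker make_tracker() { return Tracker(); }

// True when the object built inside make_tracker and the variable x share one address.
bool source_and_target_are_same_object() {
    Tracker x = make_tracker();
    return Tracker::constructed_at == &x;
}
```

When the copy/move is omitted, the function returns true: the temporary created inside `make_tracker` and the variable x are one object at one address.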