Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best form for constructors? Pass by value or reference?

I'm wondering the best form for my constructors. Here is some sample code:

class Y { ... }  class X { public:   X(const Y& y) : m_y(y) {} // (a)   X(Y y) : m_y(y) {} // (b)   X(Y&& y) : m_y(std::forward<Y>(y)) {} // (c)    Y m_y; }  Y f() { return ... }  int main() {   Y y = f();   X x1(y); // (1)   X x2(f()); // (2) } 

From what I understand, this is the best the compiler can do in each situation.

(1a) y is copied into x1.m_y (1 copy)

(1b) y is copied into the argument of the constructor of X, and then copied into x1.m_y (2 copies)

(1c) y is moved into x1.m_y (1 move)

(2a) result of f() is copied into x2.m_y (1 copy)

(2b) f() is constructed into the argument of the constructor, and then copied to x2.m_y (1 copy)

(2c) f() is created on the stack, and then moved into x2.m_y (1 move)

Now a few questions:

  1. On both counts, pass by const reference is no worse, and sometimes better than pass by value. This seems to go against the discussion on "Want Speed? Pass by Value.". For C++ (not C++0x), should I stick with pass by const reference for these constructors, or should I go pass by value? And for C++0x, should I do pass by rvalue reference over pass by value?

  2. For (2), I'd prefer if the temporary was constructed directly into x.m_y. Even the rvalue version I think requires a move which, unless the object allocates dynamic memory, is as much work as a copy. Is there any way to code this so the compiler is permitted to avoid these copies and moves?

  3. I've made a lot of assumptions in both what I think the compiler can do best and in my questions themselves. Please correct any of these if they are incorrect.

like image 259
Clinton Avatar asked Dec 01 '10 04:12

Clinton


People also ask

Is passing by reference faster than by value?

As a rule of thumb, passing by reference or pointer is typically faster than passing by value, if the amount of data passed by value is larger than the size of a pointer.

Why argument to a copy constructor must be passed as a reference?

It is necessary to pass object as reference and not by value because if you pass it by value its copy is constructed using the copy constructor. This means the copy constructor would call itself to make copy. This process will go on until the compiler runs out of memory.

Can we pass by value to copy constructor?

Passing by value (rather than by reference) means a copy needs to be made. So passing by value into your copy constructor means you need to make a copy before the copy constructor is invoked, but to make a copy you first need to call the copy constructor.

Does passing by reference call copy constructor?

The point of "pass by reference" is to not make a copy as soon as Employee constructor is called, but only when you choose to initialize one of Employee's member with the Date passed.


1 Answers

I've thrown together some examples. I used GCC 4.4.4 in all of this.

Simple case, without -std=c++0x

First, I put together a very simple example with two classes that accept an std::string each.

#include <string> #include <iostream>  struct A /* construct by reference */   {     std::string s_;      A (std::string const &s) : s_ (s)       {         std::cout << "A::<constructor>" << std::endl;       }     A (A const &a) : s_ (a.s_)       {         std::cout << "A::<copy constructor>" << std::endl;       }     ~A ()       {         std::cout << "A::<destructor>" << std::endl;       }   };  struct B /* construct by value */   {     std::string s_;      B (std::string s) : s_ (s)       {         std::cout << "B::<constructor>" << std::endl;       }     B (B const &b) : s_ (b.s_)       {         std::cout << "B::<copy constructor>" << std::endl;       }     ~B ()       {         std::cout << "B::<destructor>" << std::endl;       }   };  static A f () { return A ("string"); } static A f2 () { A a ("string"); a.s_ = "abc"; return a; } static B g () { return B ("string"); } static B g2 () { B b ("string"); b.s_ = "abc"; return b; }  int main ()   {     A a (f ());     A a2 (f2 ());     B b (g ());     B b2 (g2 ());      return 0;   } 

The output of that program on stdout is as follows:

A::<constructor> A::<constructor> B::<constructor> B::<constructor> B::<destructor> B::<destructor> A::<destructor> A::<destructor> 

Conclusion

GCC was able to optimize each and every temporary A or B away. This is consistent with the C++ FAQ. Basically, GCC may (and is willing to) generate code that constructs a, a2, b, b2 in place, even if a function is called that appearantly returns by value. Thereby GCC can avoid many of the temporaries whose existence one might have "inferred" by looking at the code.

The next thing we want to see is how often std::string is actually copied in the above example. Let's replace std::string with something we can observe better and see.

Realistic case, without -std=c++0x

#include <string> #include <iostream>  struct S   {     std::string s_;      S (std::string const &s) : s_ (s)       {         std::cout << "  S::<constructor>" << std::endl;       }     S (S const &s) : s_ (s.s_)       {         std::cout << "  S::<copy constructor>" << std::endl;       }     ~S ()       {         std::cout << "  S::<destructor>" << std::endl;       }   };  struct A /* construct by reference */   {     S s_;      A (S const &s) : s_ (s) /* expecting one copy here */       {         std::cout << "A::<constructor>" << std::endl;       }     A (A const &a) : s_ (a.s_)       {         std::cout << "A::<copy constructor>" << std::endl;       }     ~A ()       {         std::cout << "A::<destructor>" << std::endl;       }   };  struct B /* construct by value */   {     S s_;      B (S s) : s_ (s) /* expecting two copies here */       {         std::cout << "B::<constructor>" << std::endl;       }     B (B const &b) : s_ (b.s_)       {         std::cout << "B::<copy constructor>" << std::endl;       }     ~B ()       {         std::cout << "B::<destructor>" << std::endl;       }   };  /* expecting a total of one copy of S here */ static A f () { S s ("string"); return A (s); }  /* expecting a total of one copy of S here */ static A f2 () { S s ("string"); s.s_ = "abc"; A a (s); a.s_.s_ = "a"; return a; }  /* expecting a total of two copies of S here */ static B g () { S s ("string"); return B (s); }  /* expecting a total of two copies of S here */ static B g2 () { S s ("string"); s.s_ = "abc"; B b (s); b.s_.s_ = "b"; return b; }  int main ()   {     A a (f ());     std::cout << "" << std::endl;     A a2 (f2 ());     std::cout << "" << std::endl;     B b (g ());     std::cout << "" << std::endl;     B b2 (g2 ());     std::cout << "" << std::endl;      return 0;   } 

And the output, unfortunately, meets the expectation:

  S::<constructor>   S::<copy constructor> A::<constructor>   S::<destructor>    S::<constructor>   S::<copy constructor> A::<constructor>   S::<destructor>    S::<constructor>   S::<copy constructor>   S::<copy constructor> B::<constructor>   S::<destructor>   S::<destructor>    S::<constructor>   S::<copy constructor>   S::<copy constructor> B::<constructor>   S::<destructor>   S::<destructor>  B::<destructor>   S::<destructor> B::<destructor>   S::<destructor> A::<destructor>   S::<destructor> A::<destructor>   S::<destructor> 

Conclusion

GCC was not able to optimize away the temporary S created by B's constructor. Using the default copy constructor of S did not change that. Changing f, g to be

static A f () { return A (S ("string")); } // still one copy static B g () { return B (S ("string")); } // reduced to one copy! 

did have the indicated effect. It appears that GCC is willing to construct the argument to B's constructor in place but hesitant to construct B's member in place. Do note that still no temporary A or B are created. That means a, a2, b, b2 are still being constructed in place. Cool.

Let's now investigate how the new move semantics may influence the second example.

Realistic case, with -std=c++0x

Consider adding the following constructor to S

    S (S &&s) : s_ ()       {         std::swap (s_, s.s_);         std::cout << "  S::<move constructor>" << std::endl;       } 

And changing B's constructor to

    B (S &&s) : s_ (std::move (s)) /* how many copies?? */       {         std::cout << "B::<constructor>" << std::endl;       } 

We get this output

  S::<constructor>   S::<copy constructor> A::<constructor>   S::<destructor>    S::<constructor>   S::<copy constructor> A::<constructor>   S::<destructor>    S::<constructor>   S::<move constructor> B::<constructor>   S::<destructor>    S::<constructor>   S::<move constructor> B::<constructor>   S::<destructor>  B::<destructor>   S::<destructor> B::<destructor>   S::<destructor> A::<destructor>   S::<destructor> A::<destructor>   S::<destructor> 

So, we were able to replace four copies with two moves by using pass by rvalue.

But we actually constructed a broken program.

Recall g, g2

static B g ()  { S s ("string"); return B (s); } static B g2 () { S s ("string"); s.s_ = "abc"; B b (s); /* s is zombie now */ b.s_.s_ = "b"; return b; } 

The marked location shows the problem. A move was done on an object that is not a temporary. That's because rvalue references behave like lvalue references except they may also bind to temporaries. So we must not forget to overload B's constructor with one that takes a constant lvalue reference.

    B (S const &s) : s_ (s)       {         std::cout << "B::<constructor2>" << std::endl;       } 

You will then notice that both g, g2 cause "constructor2" to be called, since the symbol s in either case is a better fit for a const reference than for an rvalue reference. We can persuade the compiler to do a move in g in either of two ways:

static B g ()  { return B (S ("string")); } static B g ()  { S s ("string"); return B (std::move (s)); } 

Conclusions

Do return-by-value. The code will be more readable than "fill a reference I give you" code and faster and maybe even more exception safe.

Consider changing f to

static void f (A &result) { A tmp; /* ... */ result = tmp; } /* or */ static void f (A &result) { /* ... */ result = A (S ("string")); } 

That will meet the strong guarantee only if A's assignment provides it. The copy into result cannot be skipped, neither can tmp be constructed in place of result, since result is not being constructed. Thus, it is slower than before, where no copying was necessary. C++0x compilers and move assignment operators would reduce the overhead, but it's still slower than to return-by-value.

Return-by-value provides the strong guarantee more easily. The object is constructed in place. If one part of that fails and other parts have already been constructed, normal unwinding will clean up and, as long as S's constructor fulfills the basic guarantee with regard to its own members and the strong guarantee with regard to global items, the whole return-by-value process actually provides the strong guarantee.

Always pass by value if you're going to copy (onto the stack) anyway

As discussed in Want speed? Pass by value.. The compiler may generate code that constructs, if possible, the caller's argument in place, eliminating the copy, which it cannot do when you take by reference and then copy manually. Principal example: Do NOT write this (taken from cited article)

T& T::operator=(T const& x) // x is a reference to the source {      T tmp(x);          // copy construction of tmp does the hard work     swap(*this, tmp);  // trade our resources for tmp's     return *this;      // our (old) resources get destroyed with tmp  } 

but always prefer this

T& T::operator=(T x)    // x is a copy of the source; hard work already done {     swap(*this, x);  // trade our resources for x's     return *this;    // our (old) resources get destroyed with x } 

If you want to copy to a non-stack frame location pass by const reference pre C++0x and additionally pass by rvalue reference post C++0x

We already saw this. Pass by reference causes less copies to take place when in place construction is impossible than pass by value. And C++0x's move semantics may replace many copies with fewer and cheaper moves. But do keep in mind that moving will make a zombie out of the object that has been moved from. Moving is not copying. Just providing a constructor that accepts rvalue references may break things, as shown above.

If you want to copy to a non-stack frame location and have swap, consider passing by value anyway (pre C++0x)

If you have cheap default construction, that combined with a swap may be more efficient than copying stuff around. Consider S's constructor to be

    S (std::string s) : s_ (/* is this cheap for your std::string? */)       {         s_.swap (s); /* then this may be faster than copying */         std::cout << "  S::<constructor>" << std::endl;       } 
like image 111
15 revs, 2 users 100% Avatar answered Sep 19 '22 23:09

15 revs, 2 users 100%