Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inefficiency of copy-and-swap idiom?

I was testing some code where there is a std::vector data member inside a class. The class is both copiable and movable, and the operator= is implemented as described here using the copy-and-swap idiom.

If there are two vectors, say v1 with big capacity and v2 with small capacity, and v2 is copied to v1 (v1 = v2), the big capacity in v1 is kept after the assignment; this makes sense, since next v1.push_back() calls don't have to force new reallocations (in other words: freeing already available memory, then reallocating it to grow the vector doesn't make much sense).

But, if the same assignment is done with the class having the vector as data member, the behavior is different, and after the assignment the bigger capacity is not kept.

If the copy-and-swap idiom is not used, and copy operator= and move operator= are implemented separately, then the behavior is as expected (as for ordinary non-member vectors).

Why is that? Should we not follow copy-and-swap idiom and instead implement operator=(const X& other) (copy op=) and operator=(X&& other) (move op=) separately for optimum performance?

This is the output of a reproducible test with copy-and-swap idiom (note how in this case, after x1 = x2, x1.GetV().capacity() is 1,000, not 1,000,000):

C:\TEMP\CppTests>cl /EHsc /W4 /nologo /DTEST_COPY_AND_SWAP test.cpp
test.cpp

C:\TEMP\CppTests>test.exe
v1.capacity() = 1000000
v2.capacity() = 1000

After copy v1 = v2:
v1.capacity() = 1000000
v2.capacity() = 1000

[Copy-and-swap]

x1.GetV().capacity() = 1000000
x2.GetV().capacity() = 1000

After x1 = x2:
x1.GetV().capacity() = 1000
x2.GetV().capacity() = 1000

This is the output without copy-and-swap idiom (note how in this case x1.GetV().capacity() = 1000000, as expected):

C:\TEMP\CppTests>cl /EHsc /W4 /nologo test.cpp
test.cpp

C:\TEMP\CppTests>test.exe
v1.capacity() = 1000000
v2.capacity() = 1000

After copy v1 = v2:
v1.capacity() = 1000000
v2.capacity() = 1000

[Copy-op= and move-op=]

x1.GetV().capacity() = 1000000
x2.GetV().capacity() = 1000

After x1 = x2:
x1.GetV().capacity() = 1000000
x2.GetV().capacity() = 1000

Compilable sample code follows (tested with VS2010 SP1/VC10):

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

class X
{
public:
    X()
    {
    }

    explicit X(const size_t initialCapacity)
    {
        m_v.reserve(initialCapacity);
    }

    X(const X& other)
        : m_v(other.m_v)
    {
    }

    X(X&& other)
        : m_v(move(other.m_v))
    {
    }

    void SetV(const vector<double>& v)
    {
        m_v = v;
    }

    const vector<double>& GetV() const
    {
        return m_v;
    }

#ifdef TEST_COPY_AND_SWAP     
    //
    // Implement a unified op= with copy-and-swap idiom.
    //

    X& operator=(X other)
    {
        swap(*this, other);       
        return *this;
    }

    friend void swap(X& lhs, X& rhs)
    {
        using std::swap;

        swap(lhs.m_v, rhs.m_v);
    }    
#else    
    //
    // Implement copy op= and move op= separately.
    //

    X& operator=(const X& other)
    {
        if (this != &other)
        {
            m_v = other.m_v;
        }
        return *this;
    }

    X& operator=(X&& other)
    {
        if (this != &other)
        {
            m_v = move(other.m_v);
        }
        return *this;
    }    
#endif

private:
    vector<double> m_v;
};    

// Test vector assignment from a small vector to a vector with big capacity.
void Test1()
{
    vector<double> v1;
    v1.reserve(1000*1000);

    vector<double> v2(1000);

    cout << "v1.capacity() = " << v1.capacity() << '\n';
    cout << "v2.capacity() = " << v2.capacity() << '\n';

    v1 = v2;
    cout << "\nAfter copy v1 = v2:\n";    
    cout << "v1.capacity() = " << v1.capacity() << '\n';
    cout << "v2.capacity() = " << v2.capacity() << '\n';
}    

// Similar to Test1, but now vector is a data member inside a class.
void Test2()
{
#ifdef TEST_COPY_AND_SWAP 
    cout << "[Copy-and-swap]\n\n";
#else
    cout << "[Copy-op= and move-op=]\n\n";
#endif

    X x1(1000*1000);

    vector<double> v2(1000);
    X x2;
    x2.SetV(v2);

    cout << "x1.GetV().capacity() = " << x1.GetV().capacity() << '\n';
    cout << "x2.GetV().capacity() = " << x2.GetV().capacity() << '\n';

    x1 = x2;
    cout << "\nAfter x1 = x2:\n";
    cout << "x1.GetV().capacity() = " << x1.GetV().capacity() << '\n';
    cout << "x2.GetV().capacity() = " << x2.GetV().capacity() << '\n';
}

int main()
{
    Test1();       
    cout << '\n';    
    Test2();
}
like image 403
Mr.C64 Avatar asked Mar 03 '13 17:03

Mr.C64


1 Answers

Copy-and-swap with a std::vector can indeed lead to performance loss. The main issue here is that copying a std::vector involves two distinct stages:

  1. Allocate new section of memory
  2. Copy stuff in.

Copy-and-swap can eliminate #2 but not #1. Consider what you would observe prior to the swap() call but after the assignment op is entered. You have three vectors- the one which is about to be overwritten, the one which is a copy, and the original argument.

This clearly implies that if the vector which is about to be overwritten had sufficient or excess capacity, there's a waste in the creation of the intermediate vector, and a loss in the extra capacity of the source. Other containers can behave this way as well.

Copy-and-swap is a great baseline, especially when it comes to exception safety, but it's not globally the highest-performant solution. If you're in a tight area, then other more specialized implementations can be more efficient- but be warned, exception-safety in this area is non-trivial, and sometimes impossible if not copy-and-swapped.

like image 74
Puppy Avatar answered Sep 29 '22 14:09

Puppy