Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should we store smart pointers to class instances in large std::vector's for better performance?

When storing a large number of instances of a custom class (not a "simple" class, e.g. not a std::string, not a std::complex, etc.) in a std::vector, should we pick a simple std::vector<X>, or is std::vector<std::unique_ptr<X>> a better choice?

I wrote some benchmark code (extending code from this blog post about C++11 move semantics improvements on C++03), and it seems that vector<unique_ptr<X>> offers better performance for a 1,500,000 items vector. In fact, on a PC with Windows 7 64-bit, Intel Core i5 quad-core CPU and 8 GB of RAM, I got the following results (test.exe 1500):

  1. vector<unique_ptr<MyObject>>: 1.5 seconds
  2. vector<shared_ptr<MyObject>>: 1.6 seconds
  3. vector<MyObject>: 1.8 seconds

So, in C++03 (where std::unique_ptr is not available), it seems that the best choice is vector<shared_ptr<X>>; instead in C++11 the move-semantics-enabled std::unique_ptr seems to offer the best result.

Am I missing something here? Is this a good C++ guideline that in big vectors it's better to store (smart) pointers to class instances than class instances themselves?

Benchmark code follows:

////////////////////////////////////////////////////////////////////////////////
//
// Test vector<X> vs. vector<unique_ptr<X>> vs. vector<shared_ptr<X>>.
//
// Original benchmark code from:
//   http://blogs.msdn.com/b/vcblog/archive/2009/06/23/stl-performance.aspx
//
////////////////////////////////////////////////////////////////////////////////


#include <exception>    // std::invalid_argument
#include <iostream>     // std::cout
#include <memory>       // std::shared_ptr, std::unique_ptr
#include <ostream>      // std::endl
#include <stdexcept>    // std::exception
#include <string>       // std::wstring
#include <utility>      // std::move
#include <vector>       // std::vector

#include <Windows.h>    // Win32 Platform SDK (high performance counters, etc.)

using namespace std;


// Measure time.
class Stopwatch
{
public:

    Stopwatch()
        : m_start(0),
          m_finish(0)
    {
    }

    static void PerfStartup()
    {
        // to confine the test to run on a single processor 
        // in order to get consistent results for all tests.
        SetThreadAffinityMask(GetCurrentThread(), 1);
        SetThreadIdealProcessor(GetCurrentThread(), 0);
        Sleep(1);
    }

    void Start()
    {
        m_finish = 0;
        m_start = Counter();
    }

    void Stop()
    {
        m_finish = Counter();
    }

    // Elapsed time, in seconds
    double ElapsedTime() const
    {
        return (m_finish - m_start) * 1.0 / Frequency();
    }

    void Reset()
    {
        m_start = m_finish = 0;
    }


private:
    long long m_start;
    long long m_finish;

    static long long Counter() 
    {
        LARGE_INTEGER li;
        QueryPerformanceCounter(&li);
        return li.QuadPart;
    }

    static long long Frequency() 
    {
        LARGE_INTEGER li;
        QueryPerformanceFrequency(&li);
        return li.QuadPart;
    }


// Ban copy
private:
    Stopwatch(const Stopwatch&);
    Stopwatch& operator=(const Stopwatch&);
};


// Measure execution time of a block of code.
class ScopedStopwatch
{
public:

    ScopedStopwatch()
    {
        m_sw.Start();
    }

    ~ScopedStopwatch()
    {
        m_sw.Stop();
        cout << "Elapsed time: " << m_sw.ElapsedTime() << " sec" << endl;
    }

private:
    Stopwatch m_sw;

    ScopedStopwatch(const ScopedStopwatch&);
    ScopedStopwatch& operator=(const ScopedStopwatch&);
};


// User Defined Type
class MyObject
{
public:
    wstring name;
    wstring address;
    wstring telephone;
    wstring name2;
    wstring address2;
    wstring telephone2;

    // Default constructor
    MyObject()
    {
    }

    // Copy Constructor
    MyObject(const MyObject& other)
        : name(other.name),
          telephone(other.telephone),
          address(other.address),
          name2(other.name2),
          telephone2(other.telephone2),
          address2(other.address2)
    {
    }

    // Copy assignment operator
    MyObject& operator=(const MyObject& other)
    {
        if (this != &other)
        {
            name = other.name;
            telephone = other.telephone;
            address = other.address;
            name2 = other.name2;
            telephone2 = other.telephone2;
            address2 = other.address2;
        }

        return *this;
    }

    // Move constructor
    MyObject(MyObject&& other)
        : name(move(other.name)),
          telephone(move(other.telephone)),
          address(move(other.address)),
          name2(move(other.name2)),
          telephone2(move(other.telephone2)),
          address2(move(other.address2))
    {
    }

    // Move assignment operator
    MyObject& operator=(MyObject&& other)
    {
        if (this != &other)
        {
            name = move(other.name);
            telephone = move(other.telephone);
            address = move(other.address);
            name2 = move(other.name2);
            telephone2 = move(other.telephone2);
            address2 = move(other.address2);
        }

        return *this;
    }
};


MyObject MakeTestObject()
{
    MyObject obj;
    obj.name = L"Stephan T. Lavavej Stephan T. Lavavej Stephan T. Lavavej";
    obj.telephone = L"314159265 314159265 314159265 314159265 314159265";
    obj.address = L"127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0";
    obj.name2 = L"Mohammad Usman. Mohammad Usman. Mohammad Usman. ";
    obj.telephone2 = L"1234567890 1234567890 1234567890 1234567890 1234567890";
    obj.address2 = L"Republik Of mancunia. Republik Of mancunia Republik Of mancunia";

    return obj;
}


unique_ptr<MyObject> MakeUniqueTestObject()
{
    unique_ptr<MyObject> obj( new MyObject() );
    obj->name = L"Stephan T. Lavavej Stephan T. Lavavej Stephan T. Lavavej";
    obj->telephone = L"314159265 314159265 314159265 314159265 314159265";
    obj->address = L"127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0";
    obj->name2 = L"Mohammad Usman. Mohammad Usman. Mohammad Usman. ";
    obj->telephone2 = L"1234567890 1234567890 1234567890 1234567890 1234567890";
    obj->address2 = L"Republik Of mancunia. Republik Of mancunia Republik Of mancunia";

    return obj;
}


shared_ptr<MyObject> MakeSharedTestObject()

{    
    auto obj = make_shared<MyObject>();
    obj->name = L"Stephan T. Lavavej Stephan T. Lavavej Stephan T. Lavavej";
    obj->telephone = L"314159265 314159265 314159265 314159265 314159265";
    obj->address = L"127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0 127.0.0.0";
    obj->name2 = L"Mohammad Usman. Mohammad Usman. Mohammad Usman. ";
    obj->telephone2 = L"1234567890 1234567890 1234567890 1234567890 1234567890";
    obj->address2 = L"Republik Of mancunia. Republik Of mancunia Republik Of mancunia";

    return obj;
}


void Test(int count)
{
    Stopwatch::PerfStartup();

    cout << "Inserting " << count << " items in vector.\n";


    cout << "\nTesting vector<MyObject>\n";
    {
        ScopedStopwatch sw;

        vector<MyObject> v;
        for (int i = 0; i < count; i++)
        {
            v.push_back(MakeTestObject());
        }
    }


    cout << "\nTesting vector<unique_ptr<MyObject>>\n";
    {
        ScopedStopwatch sw;

        vector<unique_ptr<MyObject>> v;
        for (int i = 0; i < count; i++)
        {
            v.push_back(MakeUniqueTestObject());
        }
    }


    cout << "\nTesting vector<shared_ptr<MyObject>>\n";
    {
        ScopedStopwatch sw;

        vector<shared_ptr<MyObject>> v;
        for (int i = 0; i < count; i++)
        {
            v.push_back(MakeSharedTestObject());
        }
    }
}


int main(int argc, char * argv[])
{
    static const int kExitOk = 0;
    static const int kExitError = 1;

    try
    {
        if (argc != 2)
        {
            throw invalid_argument("Bad syntax. Pass insertion count (x 1,000).");
        }

        const int countK = atoi(argv[1]);
        Test(countK * 1000);

        return kExitOk;
    }
    catch (const exception & e)   
    {
        cerr << "*** ERROR: " << e.what() << endl;
        return kExitError;
    }
}

////////////////////////////////////////////////////////////////////////////////
like image 938
Mr.C64 Avatar asked Nov 14 '12 18:11

Mr.C64


People also ask

Would you prefer to store pointers or references in vectors?

Actually, if you're iterating over the vector more than you're adding objects to it, you will greatly outperform pointer by adding objects, as they will be contiguous in memory. However if you're doing more adding and erasing to and from the vector, than you are iterating over it, pointers will be more efficient.

Is smart pointer slow?

unique_ptr, the default smart pointer in C++ is no slower than direct pointer access. If you require data sharing, then shared_ptr offers you that feature at the cost of reference counting.

What is an advantage of using a smart pointer over a raw pointer?

One of the advantages of smart pointers is, that they ensure due to RAII, that the actual object is deleted. When using a raw pointer, you need to have a delete for every possible exit point, and still an exception will lead to a memory leak. Smart pointers will also free the memory if an exception occurs.


2 Answers

In C++11, if you are not using a move-enabled objects, then you should use a vector of std::unique_ptr<T> from #include <memory>. std::unique_ptr<T> is lighter weight, carries similar semantics as std::shared_ptr<T> but differs in one important area: object ownership is explicit. In the case of vector, the vector owns the objects it contains. Now, if you are using a move-enabled objects, just use a vector of your object because it will generally be "fast enough." All of the STL enabled containers in C++11 make use of move semantics (i.e. yes, slightly slower, but you gain on the productivity side of things). If performance is an issue, you could fall back to std::unqiue_ptr<T> for reasons outlined below.

If you're using pre-C++11, boost::shared_ptr<T> isn't the worse thing you could do and is probably an appropriate transition path until std::unique_ptr<T> becomes available to you. Use of boost::shared_ptr<T> involves an atomic increment, and an assignment of a pointer. Both pretty cheap, but more expensive (and different semantics) than std::unique_ptr<T>.

It doesn't surprise me that Move constructors are more expensive than moving around std::unique_ptr<T> because a move constructor is still allocating an object (even though its guts/contents are being borrowed, moved, relocated), whereas moving a std::unique_ptr<T> is just an integer/pointer assignment. Use of jemalloc(3) would probably cut the cost of Move constructors, but that's only available on *NIX platforms.

Because of that last point, the benchmark isn't entirely apples-to-apples. If you're looking for consistent performance, std::unique_ptr<T> is probably the way to go (no allocations), but if you're looking for a "native" development idiom that facilitates an easy development metholology wherein performance isn't the most important aspect (i.e. productivity is more important than performance), then go with normal objects with move constructors.

like image 121
Sean Avatar answered Nov 05 '22 19:11

Sean


Should we store smart pointers to class instances in large std::vector's for better performance?

Unless your class instances are gigantic and moving or copying them will require a lot of work, I'd consider this too early to worry about it. If each move operation requires moving dozens or hundreds of bytes, it will make a much more significant difference. As it is, the vast majority of the operation is vector overhead anyway.

I'll assume you're using a 64 bit system. Right now, sizeof(MyObject) will, I think, equal 24 (It does here anyway). Otherwise you'll be dealing with unique pointers, which are probably sized at 12 bytes, or shared pointers, which are, I think, are sized at 16.

You're saving about 0.3 seconds for 1,500,000 operations, or about 200 nanoseconds for each operation. Is it really worth it? Are you really going to be dealing with many millions of elements in your vector? Could you simplify the whole thing by just storing a pointer to the vector and sharing that (since you're using move semantics and not copy, you should be able to make this work somehow)?

Seems an awful lot like premature optimization to me. So I'm going to say, No, you shouldn't store a vector of smart pointers to instances instead of a vector to instances. Yet.

like image 22
Wug Avatar answered Nov 05 '22 21:11

Wug