How is std::vector faster than a plain array?

Tags:

I stumbled across this while benchmarking a circular buffer. Can anyone explain how a std::vector manages to outperform a plain array in this instance?

#include <iostream>
#include <vector>

struct uint_pair {
    unsigned int a, b;
    uint_pair (unsigned int x = 0, unsigned int y = 0) : a(x), b(y) {}
};

struct container {
    unsigned int pos;

#ifdef USE_VECTOR
    std::vector<uint_pair> data;
    container() : pos(0) { data.resize(16); }
#else
    uint_pair data[16];
    container() : pos(0) {}
#endif

    void add(uint_pair val) {
        data[++pos % 16] = val;
    }
};

int main() {
    container c;
    for (unsigned int i = 0; i < 1000000000; i++) c.add(uint_pair{i, i});
    std::cout << c.data[0].a << " " << c.data[0].b << std::endl;
}

These are the results I'm getting using GCC (similar with Clang):

g++ -o bench -std=c++0x -Os main.cpp -D'USE_VECTOR'
real    0m8.757s
user    0m8.750s
sys     0m0.002s

g++ -o bench -std=c++0x -Os main.cpp
real    0m9.215s
user    0m9.209s
sys     0m0.002s

261

asked Oct 04 '14 04:10

amarcus

2 Answers

Here's how you can eliminate the difference. Instead of your add, use a function like this:

void set(unsigned int x, unsigned int y) {
    ++pos;
    data[pos % 16].a = x;
    data[pos % 16].b = y;
}

called like this:

for (unsigned int i = 0; i < 1000000000; i++) c.set(i, i);

This does exactly the same thing as yours, but it avoids semantically creating a temporary object. It looks like when you use a vector the compiler is better able to optimize out the temporary.

$ g++-4.8 -o bench -std=c++11 -Os main.cpp -DUSE_VECTOR
$ time ./bench 
999999999 999999999

real    0m0.635s
user    0m0.630s
sys 0m0.002s

$ g++-4.8 -o bench -std=c++11 -Os main.cpp
$ time ./bench 
999999999 999999999

real    0m0.644s
user    0m0.639s
sys 0m0.002s

On my machine both the set and add methods yield identical performance with vectors. Only the array shows a difference. To further lend credence to optimization, if you compile with -O0 then the array method is slightly faster (but both over 10x slower than with -Os).

This doesn't explain why the compiler treats these two differently. A vector is, after all, backed by an array. Also, an std::array behaves identically to your C-style array.

answered Nov 08 '22 09:11

Adam

One problem is with the placement of the "pos" member in your struct.

For the c-array, remember that it is stored contiguously in memory adjacent to your "pos" member. When data is pushed into the c-array, extra instructions have to be issued to offset into the structure past the "pos" member. However, writing to the vector makes no such restriction since its memory is located elsewhere.

To squeeze out more performance, make sure your hottest data is at the front of a cache line.

Edit:

To get the c-array to perform just as fast as the vector, the c-array must be allocated on 8 byte boundaries on a 64 bit machine. So something like:

uint_pair* data;
unsigned int pos;

container() : pos(0) {
    std::size_t bufSize = sizeof(uint_pair) * 17;
    void* p = new char[bufSize];
    p = std::align(8, sizeof(uint_pair), p, bufSize);
    data = reinterpret_cast<uint_pair*>(p);
}

With a slightly modified add function:

void add(unsigned int x, unsigned int y) {
    auto& ref = data[pos++ % 16];
    ref.a = x;
    ref.b = y;
}

The c-array now times:

real    0m0.735s
user    0m0.730s
sys     0m0.002s

And the std::vector:

real    0m0.743s
user    0m0.736s
sys     0m0.004s

The standard library implementers are pulling out all the stops for you :)

answered Nov 08 '22 10:11

d3coy

Related questions
                            
                                When do I have to free memory?
                            
                                Should I call reset() on my C++ std random distribution to clear hidden state?
                            
                                Fast counting the number of equal bytes between two arrays [duplicate]
                            
                                C++11 range-based for and map : readability [duplicate]
                            
                                VS2012 nmake using v110_xp toolset?
                            
                                Relative path with #pragma comment(lib)
                            
                                Is there a new function type expression format in C++11?
                            
                                How to write an C/C++ application that writes to a /var/log/myapp directory?
                            
                                allocator_traits::construct() vs allocator_traits::allocate()
                            
                                Boost log severity_logger init_from_stream
                            
                                Override macro from the command line
                            
                                Is `char* p=0; std::equal(p,p,p)` well-defined according to the C++ standard?
                            
                                How do I use OpenCV's test framework with CMake?
                            
                                Why isn't mySet.erase(it++) undefined behavior, or is it?
                            
                                splitting string into vector c++
                            
                                How to move elements out of STL priority queue
                            
                                Elegant time print in C++11
                            
                                Defining an object that acts like integer infinity
                            
                                What is the worst case scenario for an unordered_map?
                            
                                Is it possible to have mutually referential C++ templates?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How is std::vector faster than a plain array?

Tags:

c++

performance

c++11

stl

vector

amarcus

People also ask

2 Answers

Adam

d3coy

Recent Activity

Donate For Us