I started doing performance comparisons between std::vector and std::deque. But then I noticed that even on push_back() the deque seemed to be faster. I must be doing something wrong; I can't believe a more general container would outperform a particular one.
My code using google benchmark:
#include "benchmark/benchmark.h"
#include <deque>
#include <vector>
#define NUM_INS 1000
static void BM_InsertVector(benchmark::State& state) {
std::vector<int> v;
v.reserve(NUM_INS);
while (state.KeepRunning()) {
state.PauseTiming();
v.clear();
state.ResumeTiming();
for (size_t i = 0; i < NUM_INS; i++)
v.push_back(i);
}
}
BENCHMARK(BM_InsertVector);
static void BM_InsertDeque(benchmark::State& state) {
std::deque<int> v;
while (state.KeepRunning()) {
state.PauseTiming();
v.clear();
state.ResumeTiming();
for (size_t i = 0; i < NUM_INS; i++)
v.push_back(i);
}
}
BENCHMARK(BM_InsertDeque);
BENCHMARK_MAIN();
Results:
Run on (1 X 2592 MHz CPU )
2016-02-18 14:03:47
Benchmark          Time(ns)   CPU(ns)  Iterations
--------------------------------------------------
BM_InsertVector        2820      2470      312500
BM_InsertDeque         1872      1563      406977
I notice some differences when playing with the number of elements, but deque always outperforms vector.
EDIT:
compiler: gcc version 5.2.1
compiling with: g++ -O3 -std=c++11 push_front.cpp -lbenchmark -lpthread
I think the -O3 is actually instrumental; when I turn it off I get slightly worse deque performance.
Deque stands for double-ended queue: a sequence container designed for efficient expansion and contraction at both ends, quite similar to vector. The difference is that a deque is a little more efficient than a vector for insertion and deletion at the front.
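For illustration (this small example is mine, not part of the answer above), deque offers constant-time push_front, which vector does not have at all:

#include <deque>
#include <iostream>

int main() {
    std::deque<int> d;

    // Constant-time insertion at both ends. A vector only offers amortized
    // O(1) push_back; prepending to a vector means the O(N) call
    // v.insert(v.begin(), x), which shifts every existing element.
    d.push_back(1);   // d = {1}
    d.push_front(0);  // d = {0, 1}
    d.push_back(2);   // d = {0, 1, 2}

    for (int x : d)
        std::cout << x << ' ';  // prints: 0 1 2
}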
A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason this myth persists is examples that compare raw arrays with misused std::vectors.
std::queue, by default, uses std::deque as its internal container, so it can at best be as fast as std::deque (or the underlying container in general); and because it is a wrapper, it may be slower, depending on how much of the indirection the compiler can optimize away.
Also, deque supports range-based for loop iteration, which is unavailable for stack and queue (both of them require making a temporary copy to iterate over their elements). This can make working with a deque directly faster than going through the stack or queue adapters.
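A short sketch of both points (this example is mine, not part of the answer above; it uses only standard-library calls):

#include <deque>
#include <iostream>
#include <queue>

int main() {
    // std::queue is an adapter; by default it wraps std::deque,
    // i.e. std::queue<int> is std::queue<int, std::deque<int>>.
    std::queue<int> q;
    for (int i = 0; i < 3; i++)
        q.push(i);

    // A queue exposes no iterators, so inspecting its contents
    // non-destructively means consuming a temporary copy.
    std::queue<int> copy = q;
    while (!copy.empty()) {
        std::cout << copy.front() << ' ';
        copy.pop();
    }
    std::cout << '\n';

    // A deque used directly supports range-based for, no copy needed.
    std::deque<int> d = {0, 1, 2};
    for (int x : d)
        std::cout << x << ' ';
}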
There are basically 3 sources of cost involved in continuously appending elements to a dynamic container:

1. memory allocation (and freeing)
2. the container's internal bookkeeping
3. operations on the elements themselves (copying/moving them around)
Let's start with 1 (memory allocation). vector keeps asking for double the memory, and deque allocates fixed-size chunks (deque is typically implemented as an array of arrays, with the lower-tier arrays being of fixed size). Asking for more memory may take longer than asking for less, but typically, unless your heap is very fragmented, asking for a big chunk all at once is the fastest way to get some memory. It's probably faster to allocate one meg once than to ask for a kilobyte 1000 times. So it seems clear that vector will eventually have the advantage here (until the container is so large it's affected by fragmentation). However, this isn't eventually: you asked for only 1000 elements. I wrote the following code: http://coliru.stacked-crooked.com/a/418b18ff8a81c1c0. It's not very interesting, but it basically uses a trivial allocator that increments a global to see how many allocations are performed.
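The linked snippet isn't reproduced in this post; a minimal sketch in the same spirit (the CountingAllocator name and the global counters are mine, not taken from the linked code) might look like this:

#include <cstddef>
#include <deque>
#include <iostream>
#include <vector>

// Globals bumped by the allocator below.
static int g_allocs = 0;
static int g_frees = 0;

// Trivial allocator that counts every allocate/deallocate call.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    template <typename U> struct rebind { using other = CountingAllocator<U>; };

    CountingAllocator() = default;
    template <typename U> CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) {
        ++g_frees;
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

int main() {
    {
        std::vector<int, CountingAllocator<int>> v;
        for (int i = 0; i < 1000; i++) v.push_back(i);
    }  // destructor frees the final buffer here
    std::cout << "vector: " << g_allocs << " allocs, " << g_frees << " frees\n";

    g_allocs = g_frees = 0;
    {
        std::deque<int, CountingAllocator<int>> d;
        for (int i = 0; i < 1000; i++) d.push_back(i);
    }
    std::cout << "deque:  " << g_allocs << " allocs, " << g_frees << " frees\n";
}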
In the course of your benchmark, vector asks for memory 11 times, and deque only 10. deque keeps asking for the same amount, vector asks for doubling amounts. As well, vector must be calling free 10 times, and deque 0 times. This seems like a small win for deque.
For internal bookkeeping (2), vector has a simpler implementation than deque. After all, vector is just a dynamic array, while deque is an array of arrays and is strictly more complex. So this is clearly a win for vector.
Finally, there are the operations on the elements themselves (3). In deque, nothing is ever moved. With vector, every new heap allocation also involves moving all the elements. It's probably optimized to use memcpy for trivial types, but even so, that's 10 calls to memcpy to copy 1, 2, 4, 8, ..., 512 integers. This is clearly a win for deque.
I can speculate that cranking up to -O3 allowed very aggressive inlining of a lot of the more complex codepaths in deque, reducing the weight of 2 (internal bookkeeping). But obviously, unless you do a much more detailed (very careful!) benchmark, you'll never know for sure.
Mostly, this post is to show that it's more complex than simply a specialized container vs a more general one. I will make a prediction though (put my neck out to be cut off, as it were): if you increase the number of elements by even, say, a factor of 2 or 4, you will not see deque win anymore. That's because deque will make 2x or 4x as many heap allocations, but vector will only make 1-2 more.
I may as well note here that deque is actually kind of an odd data structure; it's theoretically an array of arrays, but in many implementations the lower-tier arrays are either of a certain byte size or hold just one element, whichever is larger. Also, some of its big-O guarantees are nonsense. push_back is only fixed constant time because, in C++, only operations on the elements themselves count towards the big-O. Otherwise it should be clear that, since it's an array of arrays, the top-level array will be proportional in size to the number of elements already stored. Eventually that top-level array runs out of room, and you have to reallocate it, moving O(N) pointers. So it's not really O(1) push_back.
I think the vector is slower because you're calling clear(), which, depending on your STL implementation, may be freeing the underlying array storage. If that's the case, then your reserve() call isn't helping, and your vector is continuously resizing, which requires every element to be moved to the new, larger storage.