Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is std::deque faster than std::vector for inserting at the end?

Tags:

c++

stl

I started doing comparisons between:

  • inserting at the front of list
  • inserting at the back of a vector
  • inserting at the front of a deque

But then I noticed that even on push_back() the deque seemed to be faster. I must be doing something wrong, I can't believe a more general container would outperform a particular one.

My code using google benchmark:

#include "benchmark/benchmark.h"
#include <deque>
#include <vector>

#define NUM_INS 1000

static void BM_InsertVector(benchmark::State& state) {
    std::vector<int> v;
    v.reserve(NUM_INS);
    while (state.KeepRunning()) {
        state.PauseTiming();
        v.clear();
        state.ResumeTiming();
        for (size_t i = 0; i < NUM_INS; i++)
            v.push_back(i);
    }
}
BENCHMARK(BM_InsertVector);

static void BM_InsertDeque(benchmark::State& state) {
    std::deque<int> v;
    while (state.KeepRunning()) {
        state.PauseTiming();
        v.clear();
        state.ResumeTiming();
        for (size_t i = 0; i < NUM_INS; i++)
            v.push_back(i);
    }
}
BENCHMARK(BM_InsertDeque);

BENCHMARK_MAIN();

Results:

Run on (1 X 2592 MHz CPU )
2016-02-18 14:03:47
Benchmark         Time(ns)    CPU(ns) Iterations
------------------------------------------------
BM_InsertVector       2820       2470     312500                                 
BM_InsertDeque        1872       1563     406977

I notice some differences when playing with the number of elements, but deque always outperforms vector.

EDIT: compiler: gcc version 5.2.1 compiling with: g++ -O3 -std=c++11 push_front.cpp -lbenchmark -lpthread

I think the -O3 is actually instrumental; when I turn it off I get a slightly worse deque performance.

like image 833
Iosif Spulber Avatar asked Feb 18 '16 14:02

Iosif Spulber


People also ask

Is deque more efficient than vector?

Deque stands for Double-ended queues that are a sequence of containers specifically used for feature expansion and contraction on both ends, almost similar to Vectors. There is a slight difference between Deque and Vector, which is like Deque is a little more efficient than Vector in terms of insertion and deletion.

Is std::vector fast?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

Which is faster queue or deque?

So std::queue - by default - uses std::deque as its internal container, due to that it can only be at best as fast as std::deque (or the underlying container in general), and because being a wrapper it is - depending on the optimization the compiler can do - slower.

Why is deque faster?

Also, deque supports range-based for loop iteration, which is unavailable for stack and queue(both of them require to create temporary copy for iteration). Thereby deque supports all operations much more faster than both stack and queue.


2 Answers

There are basically 3 sources of cost involved in continuously appending elements to a dynamic container:

  1. Memory management.
  2. The internal bookkeeping of the container.
  3. Any operations that need to be performed on the elements themselves. Notably; any container that invalidates references on insertion is potentially moving/copying elements around.

Let's start with 1. vector keeps asking for double the memory, and deque allocates fixed sized chunks (deque is typically implemented as an array of arrays, with the lower tier arrays being of fixed size). Asking for more memory may take longer than asking for less, but typically unless your heap is very fragmented asking for a big chunk all at once is the fastest way to get some memory. It's probably faster to allocate one meg once, then ask for a kilobyte 1000 times. So it seems clear that vector will eventually have the advantage here (until the container is so large it's affected by fragmentation). However, this isn't eventually: you asked for only 1000 elements. I wrote the following code http://coliru.stacked-crooked.com/a/418b18ff8a81c1c0. It's not very interesting but it basically uses a trivial allocator that increments a global to see how many allocations are performed.

In the course of your benchmark, vector asks for memory 11 times, and deque only 10. deque keeps asking for the same amount, vector asks for doubling amounts. As well, vector must be calling free 10 times. And deque 0. This seems like a small win for deque.

For internal bookkeeping, vector has a simpler implementation than deque. After all, vector is just a dynamic array, and deque is an array of arrays and is strictly more complex. So this is clearly a win for vector.

Finally, elements on the operations themselves. In deque, nothing is ever moved. With vector, every new heap allocation also involves moving all the elements. It's probably optimized to use memcpy for trivial types, but even see, that's 10 calls to memcpy to copy 1, 2, 4, 8 ... 512 integers. This is clearly a win for deque.

I can speculate that cranking up to O3 allowed very aggressive inlining of a lot of the more complex codepaths in deque, reducing the weight of 2. But obviously, unless you do a much more detailed (very careful!) benchmark, you'll never know for sure.

Mostly, this post is to show that it's more complex than simply a specialized container vs a more general one. I will make a prediction though (put my neck out to be cut off, as it were): if you increase the number of elements by even say a factor of 2 or 4, you will not see deque win anymore. That's because deque will make 2x or 4x as many heap allocations, but vector will only make 1-2 more.

I may as well note here that deque is actually kind of an odd data structure; it's theoretically an array of arrays but in many implementations the array is either a certain size, or just one element, whichever is larger. Also, some of it's big O guarantees are nonsense. push_back is only fixed constant time, because in C++, only operations on the elements themselves count towards the big O. Otherwise it should be clear, that since it's an array of arrays, the top level array will be proportional in size to the number of elements already stored. And eventually that top level array runs out of room, and you have to reallocate it, moving O(N) pointers. So it's not really O(1) push_back.

like image 101
Nir Friedman Avatar answered Oct 04 '22 18:10

Nir Friedman


I think the vector is slower because you're calling clear() which, depending on your STL implementation, may be freeing the underlying array storage.

If that's the case, then your reserve() call isn't helping; and your vector is continuously resizing, which requires every element to be moved to the new, larger, storage.

like image 23
Dominic Dos Santos Avatar answered Oct 04 '22 18:10

Dominic Dos Santos