I started doing performance comparisons between std::vector and std::deque. But then I noticed that even on push_back() the deque seemed to be faster. I must be doing something wrong; I can't believe a more general container would outperform a particular one.
My code using google benchmark:
#include "benchmark/benchmark.h"
#include <deque>
#include <vector>
#define NUM_INS 1000
static void BM_InsertVector(benchmark::State& state) {
std::vector<int> v;
v.reserve(NUM_INS);
while (state.KeepRunning()) {
state.PauseTiming();
v.clear();
state.ResumeTiming();
for (size_t i = 0; i < NUM_INS; i++)
v.push_back(i);
}
}
BENCHMARK(BM_InsertVector);
static void BM_InsertDeque(benchmark::State& state) {
std::deque<int> v;
while (state.KeepRunning()) {
state.PauseTiming();
v.clear();
state.ResumeTiming();
for (size_t i = 0; i < NUM_INS; i++)
v.push_back(i);
}
}
BENCHMARK(BM_InsertDeque);
BENCHMARK_MAIN();
Results:
Run on (1 X 2592 MHz CPU )
2016-02-18 14:03:47
Benchmark          Time(ns)   CPU(ns)  Iterations
--------------------------------------------------
BM_InsertVector        2820      2470      312500
BM_InsertDeque         1872      1563      406977
I notice some differences when playing with the number of elements, but deque always outperforms vector.
EDIT:
compiler: gcc version 5.2.1
compiling with: g++ -O3 -std=c++11 push_front.cpp -lbenchmark -lpthread
I think the -O3 is actually instrumental; when I turn it off I get slightly worse deque performance.
Deque stands for double-ended queue: a sequence container designed for efficient expansion and contraction at both ends, quite similar to vector. The difference is that a deque is a little more efficient than a vector for insertion and deletion at the front.
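For illustration (this small example is mine, not part of the answer above), deque offers constant-time push_front, which vector does not have at all:

#include <deque>
#include <iostream>

int main() {
    std::deque<int> d;

    // Constant-time insertion at both ends. A vector only offers amortized
    // O(1) push_back; prepending to a vector means the O(N) call
    // v.insert(v.begin(), x), which shifts every existing element.
    d.push_back(1);   // d = {1}
    d.push_front(0);  // d = {0, 1}
    d.push_back(2);   // d = {0, 1, 2}

    for (int x : d)
        std::cout << x << ' ';  // prints: 0 1 2
}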
A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason this myth persists is examples that compare raw arrays with misused std::vectors.
std::queue, by default, uses std::deque as its internal container, so it can at best be as fast as std::deque (or the underlying container in general); and because it is a wrapper, it may be slower, depending on how much of the indirection the compiler can optimize away.
Also, deque supports range-based for loop iteration, which is unavailable for stack and queue (both of them require making a temporary copy to iterate over their elements). This can make working with a deque directly faster than going through the stack or queue adapters.
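A short sketch of both points (this example is mine, not part of the answer above; it uses only standard-library calls):

#include <deque>
#include <iostream>
#include <queue>

int main() {
    // std::queue is an adapter; by default it wraps std::deque,
    // i.e. std::queue<int> is std::queue<int, std::deque<int>>.
    std::queue<int> q;
    for (int i = 0; i < 3; i++)
        q.push(i);

    // A queue exposes no iterators, so inspecting its contents
    // non-destructively means consuming a temporary copy.
    std::queue<int> copy = q;
    while (!copy.empty()) {
        std::cout << copy.front() << ' ';
        copy.pop();
    }
    std::cout << '\n';

    // A deque used directly supports range-based for, no copy needed.
    std::deque<int> d = {0, 1, 2};
    for (int x : d)
        std::cout << x << ' ';
}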
There are basically 3 sources of cost involved in continuously appending elements to a dynamic container:

1. memory allocation (and freeing)
2. the container's internal bookkeeping
3. operations on the elements themselves (copying/moving them around)
Let's start with 1 (memory allocation). vector keeps asking for double the memory, and deque allocates fixed-size chunks (deque is typically implemented as an array of arrays, with the lower-tier arrays being of fixed size). Asking for more memory may take longer than asking for less, but typically, unless your heap is very fragmented, asking for a big chunk all at once is the fastest way to get some memory. It's probably faster to allocate one meg once than to ask for a kilobyte 1000 times. So it seems clear that vector will eventually have the advantage here (until the container is so large it's affected by fragmentation). However, this isn't eventually: you asked for only 1000 elements. I wrote the following code: http://coliru.stacked-crooked.com/a/418b18ff8a81c1c0. It's not very interesting, but it basically uses a trivial allocator that increments a global to see how many allocations are performed.
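The linked snippet isn't reproduced in this post; a minimal sketch in the same spirit (the CountingAllocator name and the global counters are mine, not taken from the linked code) might look like this:

#include <cstddef>
#include <deque>
#include <iostream>
#include <vector>

// Globals bumped by the allocator below.
static int g_allocs = 0;
static int g_frees = 0;

// Trivial allocator that counts every allocate/deallocate call.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    template <typename U> struct rebind { using other = CountingAllocator<U>; };

    CountingAllocator() = default;
    template <typename U> CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) {
        ++g_frees;
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

int main() {
    {
        std::vector<int, CountingAllocator<int>> v;
        for (int i = 0; i < 1000; i++) v.push_back(i);
    }  // destructor frees the final buffer here
    std::cout << "vector: " << g_allocs << " allocs, " << g_frees << " frees\n";

    g_allocs = g_frees = 0;
    {
        std::deque<int, CountingAllocator<int>> d;
        for (int i = 0; i < 1000; i++) d.push_back(i);
    }
    std::cout << "deque:  " << g_allocs << " allocs, " << g_frees << " frees\n";
}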
In the course of your benchmark, vector asks for memory 11 times, and deque only 10. deque keeps asking for the same amount, vector asks for doubling amounts. As well, vector must be calling free 10 times, and deque 0 times. This seems like a small win for deque.
For internal bookkeeping (2), vector has a simpler implementation than deque. After all, vector is just a dynamic array, while deque is an array of arrays and is strictly more complex. So this is clearly a win for vector.
Finally, there are the operations on the elements themselves (3). In deque, nothing is ever moved. With vector, every new heap allocation also involves moving all the elements. It's probably optimized to use memcpy for trivial types, but even so, that's 10 calls to memcpy to copy 1, 2, 4, 8, ..., 512 integers. This is clearly a win for deque.
I can speculate that cranking up to -O3 allowed very aggressive inlining of a lot of the more complex codepaths in deque, reducing the weight of 2 (internal bookkeeping). But obviously, unless you do a much more detailed (very careful!) benchmark, you'll never know for sure.
Mostly, this post is to show that it's more complex than simply a specialized container vs a more general one. I will make a prediction though (put my neck out to be cut off, as it were): if you increase the number of elements by even, say, a factor of 2 or 4, you will not see deque win anymore. That's because deque will make 2x or 4x as many heap allocations, but vector will only make 1-2 more.
I may as well note here that deque is actually kind of an odd data structure; it's theoretically an array of arrays, but in many implementations the lower-tier arrays are either of a certain byte size or hold just one element, whichever is larger. Also, some of its big-O guarantees are nonsense. push_back is only fixed constant time because, in C++, only operations on the elements themselves count towards the big-O. Otherwise it should be clear that, since it's an array of arrays, the top-level array will be proportional in size to the number of elements already stored. Eventually that top-level array runs out of room, and you have to reallocate it, moving O(N) pointers. So it's not really O(1) push_back.
I think the vector is slower because you're calling clear(), which, depending on your STL implementation, may be freeing the underlying array storage. If that's the case, then your reserve() call isn't helping, and your vector is continuously resizing, which requires every element to be moved to the new, larger storage.