I am trying to optimize a C++ routine. The main bottleneck in this routine is the push_back() of a vector of objects. I tried using a deque instead and even tried a list. But strangely (and contrary to theory) deque and list implementations run much slower than the vector counterpart.
In fact even clear() runs much slower for the deque and list implementations than the vector counterpart. In this case too, Vector implementation seems to be the fastest while list implementation is the slowest.
Any pointers?
Note: vector reserve() could have sped the implementation but cannot be done as it is unknown in size.
Thanks.
push_back() function is used to push elements into a vector from the back. The new value is inserted into the vector at the end, after the current last element and the container size is increased by 1.
push_back effectively increases the vector size by one, which causes a reallocation of the internal allocated storage if the vector size was equal to the vector capacity before the call.
With the simple benchmark here, we notice that emplace_back is 7.62% faster than push_back when we insert 1,000,000 object (MyClass) into an vector. Insert 1,000,000 objects.
The deque is a bit slower than the vector, that is logical because here there are more cache misses due to the segmented parts.
vector being faster to build or clear than deque or list is to be expected; it's a simpler data structure.
With regard to vector::push_back
, it has to do two things:
You can generally speed things up by eliminating step 1 by simply resizing the vector and using operator[]
to set items.
UPDATE: Original poster asked for an example. The code below times 128 mega insertions, and outputs
push_back : 2.04s
reserve & push_back : 1.73s
resize & place : 0.48s
when compiled and run with g++ -O3 on Debian/Lenny on an old P4 machine.
#include <iostream>
#include <time.h>
#include <vector>
int main(int,char**)
{
const size_t n=(128<<20);
const clock_t t0=clock();
{
std::vector<unsigned char> a;
for (size_t i=0;i<n;i++) a.push_back(i);
}
const clock_t t1=clock();
{
std::vector<unsigned char> a;
a.reserve(n);
for (size_t i=0;i<n;i++) a.push_back(i);
}
const clock_t t2=clock();
{
std::vector<unsigned char> a;
a.resize(n);
for (size_t i=0;i<n;i++) a[i]=i;
}
const clock_t t3=clock();
std::cout << "push_back : " << (t1-t0)/static_cast<float>(CLOCKS_PER_SEC) << "s" << std::endl;
std::cout << "reserve & push_back : " << (t2-t1)/static_cast<float>(CLOCKS_PER_SEC) << "s" << std::endl;
std::cout << "resize & place : " << (t3-t2)/static_cast<float>(CLOCKS_PER_SEC) << "s" << std::endl;
return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With