I'm noticing a large performance difference between std::vector and boost::stable_vector. Below is example where I construct and insert 100,000 ints into both a vector and a stable vector.
test.cpp:
#include <iostream>
#include <vector>
#include <boost/container/stable_vector.hpp>
#include <boost/timer/timer.hpp>
int main(int argc, char** argv) {
int size = 1e5;
boost::timer::cpu_timer timer;
timer.start();
std::vector<int> vec(size);
timer.stop();
std::cout << timer.format();
timer.start();
boost::container::stable_vector<int> svec(size);
timer.stop();
std::cout << timer.format();
}
compile:
g++ -O3 test.cpp -o test -lboost_system-mt -lboost_timer-mt
output:
0.000209s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
5.697013s wall, 5.690000s user + 0.000000s system = 5.690000s CPU (99.9%)
What is the reason for this huge discrepancy? My understanding is that both types should have similar insertion performance.
UPDATE: boost version: 1.54
dev/stable_vector_test: g++ --version
i686-apple-darwin11-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I added std::list to the code and tried passing -DNDEBUG in addition to -O3.
dev/stable_vector_test: make
g++ -g test.cpp -o test -lboost_system-mt -lboost_timer-mt
dev/stable_vector_test: ./test
size: 10000
vector: 0.000047s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
list: 0.001168s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
stable_vector: 0.963679s wall, 0.960000s user + 0.000000s system = 0.960000s CPU (99.6%)
dev/stable_vector_test: make opt
g++ -O3 -DNDEBUG test.cpp -o test -lboost_system-mt -lboost_timer-mt
dev/stable_vector_test: ./test
size: 10000
vector: 0.000038s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
list: 0.000659s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
stable_vector: 0.000752s wall, 0.000000s user + 0.000000s system = 0.000000s CPU (n/a%)
So with -O3 and -DNDEBUG I get comparable performance to std::list
Since stable_vector
doesn't use contiguous storage, it seems reasonable that it would take a lot longer than std::vector
to allocate its initial memory.
As noted in a background post on stable_vector, one possible implementation of stable_vector
involves allocating a separate node for each element of the vector. And sure enough, the source code for the stable_vector
constructor shows that it calls resize
, which calls insert
with a pair of iterators, and insert
performs N node allocations:
// (initialization...)
while(first != last){
const node_ptr p = this->priv_get_from_pool();
BOOST_ASSERT(!!p);
//Put it in the index so rollback can return it
//in pool if construct_in_place throws
*it_past_newly_constructed = p;
//Constructs and fixes up pointers This can throw
this->priv_build_node_from_it(p, it_past_newly_constructed, first);
++first;
++it_past_newly_constructed;
}
So it's doing something similar to std::list
, which your data supports.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With