I was benchmarking some STL algorithms and was surprised by the time taken by the following code. (I measured the g++-compiled binary, with no optimizations, using the time command.)
#include <vector>

struct vec2 {
    int x, y;
    vec2() : x(0), y(0) {}
};

int main(int argc, char* argv[]) {
    const int size = 200000000;
    std::vector<vec2> tab(size);      // 2.26s
    // vec2* tab = new vec2[size];    // 1.29s
    // tab[0].x = 0;
    // delete[] tab;
    return 0;
}
The time taken by the vector initialization is 2.26s, while new (and delete) takes 1.29s. What is the vector ctor doing that would take so much longer? new[] calls the constructor on every element, just as the vector ctor would, right?
I then compiled with -O3; everything got faster, but there was still a gap between the two versions (0.83s and 0.75s, respectively).
Any ideas?
The speed will depend on the implementation, but the most likely reason the vector is slower is that this vector constructor cannot default-construct its elements; the elements are always copy-constructed. For example,
std::vector<vec2> tab(size);
is in reality interpreted as
std::vector<vec2> tab(size, vec2());
i.e. the second argument gets its value from a default argument. The vector then allocates raw memory and copies this default-constructed element, passed in from the outside, into every element of the new vector (using the copy constructor). That can be slower than default-constructing each element directly, which is what new[] does.
To illustrate the difference with a code sketch: new vec2[size] is roughly equivalent to
vec2 *v = (vec2 *) malloc(size * sizeof(vec2));
for (size_t i = 0; i < size; ++i)
    // Default-construct `v[i]` in place
    new (&v[i]) vec2();
return v;
while vector<vec2>(size) is roughly equivalent to
vec2 source; // Default-constructed "original" element
vec2 *v = (vec2 *) malloc(size * sizeof(vec2));
for (size_t i = 0; i < size; ++i)
    // Copy-construct `v[i]` in place
    new (&v[i]) vec2(source);
return v;
Depending on the implementation, the second approach might turn out slower.
A two-times difference in speed is hard to justify, though; then again, benchmarking unoptimized code makes little sense anyway. The much smaller difference you observed with optimized code is about what one might reasonably expect in this case.
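One way to see which of the two strategies your standard library actually uses is to count constructor calls with a small instrumented type; the probe struct below is purely illustrative:

#include <cstdio>
#include <vector>

// Illustrative type that records which constructor runs.
struct probe {
    static long defaults, copies;
    probe() { ++defaults; }
    probe(const probe&) { ++copies; }
};
long probe::defaults = 0;
long probe::copies = 0;

int main() {
    std::vector<probe> v(5);
    std::printf("defaults: %ld, copies: %ld\n", probe::defaults, probe::copies);
    return 0;
}

A copy-constructing implementation, as described above, would report 1 default construction and 5 copies; an implementation that default-constructs the elements in place would report 5 defaults and no copies.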
Both versions initialize the memory.
As several people have pointed out, the vector uses copy construction while the array uses the default constructor. Your compiler appears to optimize the latter better than the former.
Note that in Real Life, you rarely want to initialize such a huge array in one fell swoop. (What use are a bunch of zeroes? Obviously you intend to put something else in there eventually... And initializing hundreds of megabytes is very cache-unfriendly.)
Instead, you would write something like:
const int size = 200000000;
std::vector<vec2> v;
v.reserve(size);
Then, when you are ready to put a real element into the vector, you use v.push_back(element). The reserve() allocates memory without initializing it; the push_back() copy-constructs into the reserved space.
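As a sketch of that pattern (reusing the vec2 struct and size constant from the question; the values written here are arbitrary placeholders):

std::vector<vec2> v;
v.reserve(size);                 // allocate, but do not construct
for (int i = 0; i < size; ++i) {
    vec2 e;
    e.x = i;                     // some "real", non-identical data
    e.y = -i;
    v.push_back(e);              // copy-construct into the reserved storage
}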
Alternatively, when you want to put a new element into the vector, you can use v.resize(v.size()+1) and then modify the element via v.back(). (This is how a "pool allocator" might work.) Although this sequence initializes the element and then overwrites it, it all happens in the L1 cache, which is almost as fast as not initializing it at all.
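A sketch of that alternative, again assuming the vec2 and size from the question:

std::vector<vec2> v;
v.reserve(size);
for (int i = 0; i < size; ++i) {
    v.resize(v.size() + 1);      // default-constructs exactly one new element...
    v.back().x = i;              // ...which is overwritten immediately, while hot in cache
    v.back().y = -i;
}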
So for a fair comparison, try a large vector (with reserve) vs. an array for creating a sequence of non-identical items. You should find the vector is faster.
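A minimal, self-contained benchmark along those lines might look like the following (it times with <chrono> instead of the external time command, and the loop bodies are just placeholders for "non-identical items"):

#include <chrono>
#include <cstdio>
#include <vector>

struct vec2 { int x, y; vec2() : x(0), y(0) {} };

int main() {
    const int size = 200000000;
    using clock = std::chrono::steady_clock;

    // Vector built with reserve() + push_back()
    auto t0 = clock::now();
    std::vector<vec2> v;
    v.reserve(size);
    for (int i = 0; i < size; ++i) {
        vec2 e; e.x = i; e.y = i;
        v.push_back(e);
    }
    auto t1 = clock::now();

    // Raw array: new[] default-constructs, then every element is overwritten
    vec2* a = new vec2[size];
    for (int i = 0; i < size; ++i) {
        a[i].x = i;
        a[i].y = i;
    }
    auto t2 = clock::now();

    // Use the data so the compiler cannot discard the loops entirely
    std::printf("check: %d %d\n", v.back().x, a[size - 1].x);
    delete[] a;

    auto ms = [](clock::duration d) {
        return std::chrono::duration_cast<std::chrono::milliseconds>(d).count();
    };
    std::printf("vector: %lld ms, array: %lld ms\n",
                (long long)ms(t1 - t0), (long long)ms(t2 - t1));
    return 0;
}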