Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does the capacity of std::vector grow automatically? What is the rate?

Tags:

c++

stdvector

I had been going through the Book: C++ Primer, Third Edition By Stanley B. Lippman, Josée Lajoie, found 1 mistake in the program given under the Article 6.3 How a vector Grows Itself, this program missed a "<" in the couts:

#include <vector> #include <iostream>  using namespace std;  int main() {     vector<int> ivec;     cout < "ivec: size: " < ivec.size() < " capacity: "  < ivec.capacity() < endl;          for (int ix = 0; ix < 24; ++ix) {         ivec.push_back(ix);         cout < "ivec: size: " < ivec.size()         < " capacity: "  < ivec.capacity() < endl;     }     } 

Later within that article:

"Under the Rogue Wave implementation, both the size and the capacity of ivec after its definition are 0. On inserting the first element, however, ivec's capacity is 256 and its size is 1."

But, on correcting and running the code i get the following output:


ivec: size: 0 capacity: 0 ivec[0]=0 ivec: size: 1 capacity: 1 ivec[1]=1 ivec: size: 2 capacity: 2 ivec[2]=2 ivec: size: 3 capacity: 4 ivec[3]=3 ivec: size: 4 capacity: 4 ivec[4]=4 ivec: size: 5 capacity: 8 ivec[5]=5 ivec: size: 6 capacity: 8 ivec[6]=6 ivec: size: 7 capacity: 8 ivec[7]=7 ivec: size: 8 capacity: 8 ivec[8]=8 ivec: size: 9 capacity: 16 ivec[9]=9 ivec: size: 10 capacity: 16 ivec[10]=10 ivec: size: 11 capacity: 16 ivec[11]=11 ivec: size: 12 capacity: 16 ivec[12]=12 ivec: size: 13 capacity: 16 ivec[13]=13 ivec: size: 14 capacity: 16 ivec[14]=14 ivec: size: 15 capacity: 16 ivec[15]=15 ivec: size: 16 capacity: 16 ivec[16]=16 ivec: size: 17 capacity: 32 ivec[17]=17 ivec: size: 18 capacity: 32 ivec[18]=18 ivec: size: 19 capacity: 32 ivec[19]=19 ivec: size: 20 capacity: 32 ivec[20]=20 ivec: size: 21 capacity: 32 ivec[21]=21 ivec: size: 22 capacity: 32 ivec[22]=22 ivec: size: 23 capacity: 32 ivec[23]=23 ivec: size: 24 capacity: 32 

Is the capacity increasing with the formula 2^N where N is the initial capacity? Please explain.

like image 388
Sadique Avatar asked Mar 08 '11 12:03

Sadique


People also ask

How does C++ vector Increase Size?

The C++ function std::vector::resize() changes the size of vector. If n is smaller than current size then extra elements are destroyed. If n is greater than current container size then new elements are inserted at the end of vector. If val is specified then new elements are initialed with val.

How the size of a vector increase once it is full?

Explanation: Once the vector is full i.e. number of elements in the vector becomes equal to the capacity of the vector then vector doubles its capacity i.e. if previous capacity was 2 then new capacity becomes 2 * 2 = 4 or 2 + 2 = 4.

Is std::vector fast?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

What is the default capacity of vector in C++?

Standard doesn't specifies anything about initial capacity of vector but most implementations use 0 .


1 Answers

The rate at which the capacity of a vector grows is implementation dependent. Implementations almost invariably choose exponential growth, in order to meet the amortized constant time requirement for the push_back operation. What amortized constant time means and how exponential growth achieves this is interesting.

Every time a vector's capacity is grown the elements need to be copied. If you 'amortize' this cost out over the lifetime of the vector, it turns out that if you increase the capacity by an exponential factor you end up with an amortized constant cost.

This probably seems a bit odd, so let me explain to you how this works...

  • size: 1 capacity 1 - No elements have been copied, the cost per element for copies is 0.
  • size: 2 capacity 2 - When the vector's capacity was increased to 2, the first element had to be copied. Average copies per element is 0.5
  • size: 3 capacity 4 - When the vector's capacity was increased to 4, the first two elements had to be copied. Average copies per element is (2 + 1 + 0) / 3 = 1.
  • size: 4 capacity 4 - Average copies per element is (2 + 1 + 0 + 0) / 4 = 3 / 4 = 0.75.
  • size: 5 capacity 8 - Average copies per element is (3 + 2 + 1 + 1 + 0) / 5 = 7 / 5 = 1.4
  • ...
  • size: 8 capacity 8 - Average copies per element is (3 + 2 + 1 + 1 + 0 + 0 + 0 + 0) / 8 = 7 / 8 = 0.875
  • size: 9 capacity 16 - Average copies per element is (4 + 3 + 2 + 2 + 1 + 1 + 1 + 1 + 0) / 9 = 15 / 9 = 1.67
  • ...
  • size 16 capacity 16 - Average copies per element is 15 / 16 = 0.938
  • size 17 capacity 32 - Average copies per element is 31 / 17 = 1.82

As you can see, every time the capacity jumps, the number of copies goes up by the previous size of the array. But because the array has to double in size before the capacity jumps again, the number of copies per element always stays less than 2.

If you increased the capacity by 1.5 * N instead of by 2 * N, you would end up with a very similar effect, except the upper bound on the copies per element would be higher (I think it would be 3).

I suspect an implementation would choose 1.5 over 2 both to save a bit of space, but also because 1.5 is closer to the golden ratio. I have an intuition (that is currently not backed up by any hard data) that a growth rate in line with the golden ratio (because of its relationship to the fibonacci sequence) will prove to be the most efficient growth rate for real-world loads in terms of minimizing both extra space used and time.

like image 129
Omnifarious Avatar answered Sep 27 '22 19:09

Omnifarious