Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is vector array doubled?

Why does the classic implementation of Vector (ArrayList for Java people) double its internal array size on each expansion instead of tripling or quadrupling it?

like image 808
TheOne Avatar asked Sep 15 '09 02:09

TheOne


People also ask

How vector increases its size?

By default, the vector increases its capacity by double. However, if an increment is specified in its constructor, Vector will grow in accordance with it in each allocation cycle.

Why does the GROW Method double the size of the array instead of just increase it by one cell?

If you grow the array one element at a time, you end up with quadratic behaviour as you copy the elements from array to array+1 to array+2. Doubling reduces the cost to linear time. This growth strategy gives you get so-called "amortized constant time insertions".


2 Answers

When calculating the average time to insert into a vector, you need to allow for the non-growing inserts and the growing inserts.

Call the total number of operations to insert n items ototal, and the average oaverage.

If you insert n items, and you grow by a factor of A as required, then there are ototal = n + ΣAi [ 0 < i < 1 + lnAn ] operations. In the worst case you use 1/A of the allocated storage.

Intuitively, A = 2 means at worst you have ototal = 2n, so oaverage is O(1), and the worst case you use 50% of the allocated storage.

For a larger A, you have a lower ototal, but more wasted storage.

For a smaller A, ototal is larger, but you don't waste so much storage. As long as it grows geometrically, it's still O(1) amortised insertion time, but the constant will get higher.

For growth factors 1.25 ( red ), 1.5 ( cyan ), 2 ( black ), 3 ( blue ) and 4 ( green ), these graphs show point and average size efficiency ( ratio of size/allocated space; more is better ) on the left and time efficiency ( ratio of insertions / operations; more is better ) on the right for inserting 400,000 items. 100% space efficiency is reached for all growth factors just prior to resizing; the case for A = 2 shows time efficiency between 25% and 50%, and space efficiency about 50%, which is good for most cases:

space and time efficiency graph - C like implementations

For runtimes such as Java, arrays are zero filled, so the number of operations to allocate is proportional to the size of the array. Taking account of this gives reduces the difference between the time efficiency estimates:

space and time efficiency graph - Java like implementations

like image 67
Pete Kirkham Avatar answered Sep 28 '22 04:09

Pete Kirkham


Any multiple is a compromise. Make it too big and you waste too much memory. Make it too small and you waste much time for reallocations and copying. I guess that doubling is there because it works and is very easy to implement. I also saw a proprietary STL-like library that uses 1.5 as multiplier for the same - I guess its developers considered doubling wasting too much memory.

like image 30
sharptooth Avatar answered Sep 28 '22 04:09

sharptooth