For example, ArrayList
s in Java have a resizing factor of 2. When the array that the ArrayList
is wrapped around runs out of space, all the elements of that array are transferred to a new array that is 2 times the size of the original array.
Since Python lists/arrays are naturally dynamic, what are their resizing factor? Or do they use some other method of scaling? If so, what's that method? What's its asymptotic runtime (big O)?
For CPython, in particular, the overallocation during resize is implemented in objects/listobject.c. It is described as
mild, but ... enough to give linear-time amortized behavior over a long sequence of appends() in the presence of a poorly-performing system realloc().
When the list needs to be grown, it is grown to be approximately 112.5% of the necessary size. In particular, the actual size is new_size + new_size >> 3 + (newsize < 9 ? 3 : 6)
.
To answer your other question: adding an item to the end of an ArrayList
or similar takes amortized O(1) time as long as there is any factor k > 1 such that reallocation always makes an array that is k times as big as the previous one.
The actual factor used doesn't really matter as long as it's bigger than one. Alternative reallocation schemes might have a different complexity. If reallocation adds a constant amount, for example, then adding an item takes amortized O(N) time.
That's why all professionally written dynamically growing arrays grow by some factor each time. The factor provides a fixed upper bound on the proportion of wasted space ((k-1)/k), traded off against a fixed upper bound on the average number of times each item could be copied in a sequence of adds.
Lets say you're using factor k and you've just performed the Nth reallocation. That means you have added about k^N items. How many items have you copied? Let the be C. Then:
C = k^(N-1) + k(N-2) + ... + 1
kC - C = k^N + k^(N-1) - k^(N-1) + k^(N-2) - k^(N-2) ... - 1
kC - C = k^N - 1
C = (k^N-1) / (k-1)
The number of copies per item added, then is C/K^N, which is pretty much 1/(k-1).
So Java's factor of k=2 means there is about one copy per add operation, with up to 50% unused slots, and CPython's factor of k=1.125 means there are about 8 copies per add operation with up to 11% unused slots.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With