What is the ideal growth rate for a dynamically allocated array?

C++ has std::vector and Java has ArrayList, and many other languages have their own form of dynamically allocated array. When a dynamic array runs out of space, it gets reallocated into a larger area and the old values are copied into the new array. A question central to the performance of such an array is how fast the array grows in size. If you always only grow large enough to fit the current push, you'll end up reallocating every time. So it makes sense to double the array size, or multiply it by say 1.5x.
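
For illustration only, here is a minimal sketch of where the growth factor enters. This is not how std::vector is actually implemented; the IntArray type, the GROWTH constant, and the starting capacity of 4 are made up for the example.

```cpp
// Minimal dynamic-array sketch: when the buffer is full, allocate a larger
// one (old capacity times GROWTH), copy the elements over, and free the old
// block -- the freed block leaves a hole in memory.
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

struct IntArray {
    int*   data     = nullptr;
    size_t size     = 0;
    size_t capacity = 0;

    static constexpr double GROWTH = 1.5;  // the growth factor in question

    void push_back(int value) {
        if (size == capacity) {
            size_t new_cap = capacity ? (size_t)(capacity * GROWTH) : 4;
            int* new_data = (int*)std::malloc(new_cap * sizeof(int));
            if (size) std::memcpy(new_data, data, size * sizeof(int));  // copy old values
            std::free(data);  // the old block becomes free space (a "hole")
            data = new_data;
            capacity = new_cap;
        }
        data[size++] = value;
    }

    ~IntArray() { std::free(data); }
};

int main() {
    IntArray a;
    for (int i = 0; i < 100; ++i) a.push_back(i);
    std::printf("size=%zu capacity=%zu\n", a.size, a.capacity);  // e.g. size=100 capacity=141
}
```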

Is there an ideal growth factor? 2x? 1.5x? By ideal I mean mathematically justified, best balancing performance and wasted memory. I realize that, theoretically, given that your application could have any potential distribution of pushes, this is somewhat application dependent. But I'm curious to know whether there's a value that's "usually" best, or is considered best within some rigorous constraint.

I've heard there's a paper on this somewhere, but I've been unable to find it.

asked Jul 08 '09 by Joseph Garvin

People also ask

How can you increase the size of a dynamically allocated array?

In C there is a function, realloc(), that attempts to do exactly that: it extends the allocated block to the new size in place if possible; if not, it allocates a new block of sufficient size, copies the data over to the new block, and returns a pointer to the new block.
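
As a rough sketch of that behavior (shown in C++ via <cstdlib>; the buffer sizes are arbitrary), growing a buffer with std::realloc might look like this. Note that realloc only works on memory obtained from malloc/calloc/realloc, not from new:

```cpp
// Sketch: growing a malloc'd buffer with std::realloc. If the block can be
// extended in place no copy happens; otherwise realloc allocates a new block,
// copies the old contents into it, and frees the old block.
#include <cstdio>
#include <cstdlib>

int main() {
    size_t count = 4;
    int* data = (int*)std::malloc(count * sizeof(int));
    if (!data) return 1;
    for (size_t i = 0; i < count; ++i) data[i] = (int)i;

    size_t new_count = 8;  // grow the buffer from 4 to 8 ints
    int* grown = (int*)std::realloc(data, new_count * sizeof(int));
    if (!grown) { std::free(data); return 1; }  // on failure the old block is left intact
    data = grown;

    std::printf("buffer grown to %zu ints; data[3] is still %d\n", new_count, data[3]);
    std::free(data);
}
```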

What function allocates the size of a dynamic array?

To allocate memory dynamically, the library functions malloc(), calloc(), realloc(), and free() are used. These functions are declared in the <stdlib.h> header file.

Can we increase the size of a statically allocated array?

The simple answer is no, this cannot be done; hence the name "static". That said, lots of languages have things that look like statically allocated arrays but are actually statically allocated references to a dynamically allocated array.


1 Answer

I remember reading many years ago why 1.5 is preferred over two, at least as applied to C++ (this probably doesn't apply to managed languages, where the runtime system can relocate objects at will).

The reasoning is this:

  1. Say you start with a 16-byte allocation.
  2. When you need more, you allocate 32 bytes, then free up 16 bytes. This leaves a 16-byte hole in memory.
  3. When you need more, you allocate 64 bytes, freeing up the 32 bytes. This leaves a 48-byte hole (if the 16 and 32 were adjacent).
  4. When you need more, you allocate 128 bytes, freeing up the 64 bytes. This leaves a 112-byte hole (assuming all previous allocations are adjacent).
  5. And so on, and so forth.

The idea is that, with a 2x expansion, the resulting hole is never large enough to be reused for the next allocation: the blocks freed so far total 16·(2^n − 1) bytes, which is always 16 bytes short of the 16·2^n bytes needed next. With a 1.5x growth factor we have this instead (a small simulation of both cases follows the second list):

  1. Start with 16 bytes.
  2. When you need more, allocate 24 bytes, then free up the 16, leaving a 16-byte hole.
  3. When you need more, allocate 36 bytes, then free up the 24, leaving a 40-byte hole.
  4. When you need more, allocate 54 bytes, then free up the 36, leaving a 76-byte hole.
  5. When you need more, allocate 81 bytes, then free up the 54, leaving a 130-byte hole.
  6. When you need more, allocate 122 bytes (121.5 rounded up), which fits in the 130-byte hole.
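
To make the comparison concrete, here is a small sketch (not part of the original answer; it simply replays the arithmetic above, assuming every freed block sits adjacent to the previous ones) that tracks the combined hole and checks whether the next allocation would fit into it:

```cpp
// Sketch: for a given growth factor, track the combined size of all freed
// buffers (assumed adjacent) and check whether the next allocation fits.
#include <cmath>
#include <cstdio>

static void simulate(double factor, int steps) {
    double current = 16.0;  // current buffer size, starting at 16 bytes
    double hole    = 0.0;   // combined size of previously freed, adjacent blocks
    std::printf("growth factor %.1fx:\n", factor);
    for (int i = 0; i < steps; ++i) {
        double next = std::ceil(current * factor);  // size of the next buffer
        std::printf("  next alloc %4.0f bytes, hole so far %4.0f bytes -> %s\n",
                    next, hole, next <= hole ? "fits in the hole" : "needs fresh memory");
        hole += current;  // the old buffer is freed and joins the hole
        current = next;
    }
}

int main() {
    simulate(2.0, 6);  // 32, 64, 128, ... never fit in the hole
    simulate(1.5, 6);  // 24, 36, 54, 81, 122 -- the 122 fits in the 130-byte hole
}
```

With a 2x factor the hole is always exactly 16 bytes (the initial allocation) short of what is needed next, so it is never reused; with 1.5x the fifth reallocation of 122 bytes fits into the accumulated 130-byte hole, matching the numbers in the answer.
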
answered Sep 16 '22 by Chris Jester-Young