A data structure supporting O(1) random access and worst-case O(1) append?

Tags:

I realize a resizable indexed collection that uses an array to store its elements (like List<T> in .NET or ArrayList in Java) has amortized O(1) insertion time at the end of the collection. But then there's always that one pesky insertion at critical junctures where the collection has just reached its capacity and the next insertion requires a full copy of all elements in the internal array to a new one (presumably twice as large).

A common mistake (in my opinion) is to go with a linked list to "fix" this problem; but I believe the overhead of allocating a node for every element can be quite wasteful, and in fact would dwarf the benefit of a guaranteed O(1) insertion in that rare case that the array insertion is costly—when, in fact, every other array insertion is significantly cheaper (and faster).

What I was thinking might make sense is a hybrid approach consisting of a linked list of arrays, where every time the current "head" array reaches its capacity, a new array twice as large is added to the linked list. Then no copy would be necessary since the linked list would still have the original array. Essentially this seems analogous (to me) to the List<T> or ArrayList approach, except that wherever you previously would've incurred the cost of copying all the elements of the internal array, here you only incur the cost of allocating a new array plus a single node insertion.

Of course, this would complicate other features if they were desired (e.g., inserting/removing into/from the middle of the collection); but as I've expressed in the title, I'm really just looking for an add-only (and iterate) collection.

Are there any data structures out there ideally suited to this purpose? Or, can you think of one yourself?

623

asked Jan 29 '11 01:01

Dan Tao

1 Answers

There is a beautiful structure called an extendible array that has worst-case O(1) insertion and O(n) memory overhead (that is, it's asymptotically comparable to a dynamic array, but has O(1) worst-case insertion). The trick is to take the approach that the vector uses - doubling and copying - but to make the copying lazy. For example, suppose you have an array of four elements like this one:

[1] [2] [3] [4]

If you want to add a new number, say 5, you begin by allocating an array that's twice as large:

[1] [2] [3] [4] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]

Next, you insert 5 into the new array:

[1] [2] [3] [4] [ ] [ ] [ ] [ ] [5] [ ] [ ] [ ]

Finally, pull down the 4 from the old array into the new:

[1] [2] [3] [ ] [ ] [ ] [ ] [4] [5] [ ] [ ] [ ]

From now on, any time you do an insert, add the element to the new array and pull down one more element from the old array. For example, after adding 6, we'd get

[1] [2] [ ] [ ] [ ] [ ] [3] [4] [5] [6] [ ] [ ]

After inserting two more values, we end up here:

[ ] [ ] [ ] [ ] [1] [2] [3] [4] [5] [6] [7] [8]

If we now need to add one more element, we discard the now-empty old array and allocate an array twice as large as the current array (capable of holding 16 elements):

[1] [2] [3] [4] [5] [6] [7] [8] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]

And repeat this process. Discounting the cost of a memory allocation (which is usually sublinear in the size of the array), you do at most O(1) work per insertion.

Lookups are still O(1), since you just decide which of the two arrays to look in, while insertions in the middle are O(n) because of the shuffling.

If you're curious, I have a Java implementation of this structure on my personal site. I don't know how useful you'll find it, but you're more than welcome to try it out.

If you want to invest a bit of time reading over a research paper and trying to implement a fairly complex data structure, you can get the same result (worst-case O(1) append) in O(√n) space overhead (which is provably optimal, by the way) using the ideas in this paper. I never got around to actually implementing this, but it's certainly well-worth the read if memory is a super-scarce resource. Interestingly, it uses this above construction as a subroutine!

answered Oct 04 '22 15:10

templatetypedef

Related questions
                            
                                Matrix multiplication using arrays
                            
                                Underscore: remove all key/value pairs from an array of object
                            
                                Best way to sort an array
                            
                                API java 5 and more: should I return an array or a Collection?
                            
                                How to select array index after Where clause using Linq?
                            
                                Remove one column for a numpy array
                            
                                Does fast enumeration in Objective-C guarantee the order of iteration?
                            
                                Sorting two corresponding arrays
                            
                                Push to array reference
                            
                                Cannot use .begin() or .end() on an array
                            
                                Modify a particular row/column of a NumPy array
                            
                                Any way to make jQuery.inArray() case insensitive?
                            
                                Repeating each element of a numpy array 5 times
                            
                                Javascript: Is the length method efficient?
                            
                                Type hinting - specify an array of objects
                            
                                returning numpy arrays via pybind11
                            
                                How to call an element in a numpy array?
                            
                                How to create an array of thread objects in C++11?
                            
                                What is the difference between String[] and String... in Java?
                            
                                How to insert a new element at any position of a JS array?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

A data structure supporting O(1) random access and worst-case O(1) append?

Tags:

performance

arrays

language-agnostic

big-o

data-structures

Dan Tao

People also ask

1 Answers

templatetypedef

Recent Activity

Donate For Us