Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dynamic Array with O(1) removal of any element

This question is about a data structure I thought of. It is a dynamic array, like std::vector<> in C++, except the removal algorithm is different.

In a normal dynamic array, when an element is removed, all the remaining elements must be shifted down, which is O(n), unless it's the last element, which will be O(1).

In this one, if any element is removed, it is replaced by the last element. This of course loses ordering of the elements. But now removal of any element is constant time.

A list will have the same removal times, but this structure has random access. The only caveat with that is you don't know what you're accessing, since ordering could be jumbled, so what use is random access anyway. Plus a list won't mess up any pointers/iterators to elements.

So meh, this structure seems rather useless except for the very specific task of strictly walking through elements and perhaps removing them along the way. A list can do the same, but this has better cache performance.

So, does this strange/useless structure have a name, and does it have any uses? Or just a nice little brain storm?

like image 224
GManNickG Avatar asked Jun 29 '09 08:06

GManNickG


2 Answers

This idea is used in Knuth (Fisher–Yates) shuffle. An element picked at random is replaced with the last one in the array. Since what we want is a random permutation anyway, the reordering doesn't matter.

like image 114
Rafał Dowgird Avatar answered Nov 01 '22 01:11

Rafał Dowgird


So, does this strange/useless structure have a name, and does it have any uses?

I've used something similar in simulations of multi-process systems.

In a scheduler for processes implemented as state machines, each process is either waiting for an external event, active or completed. The scheduler has an array of pointers to the processes.

Initially each process is active, and the scheduler has the index of the last waiting and first completed process, initially zero and the length of the array.

V-- waiting
[ A-active, B-active, C-active, D-active ]
                             completed --^
^- run

To step the process to its next state, the scheduler iterates over the array and runs each process in turn. If a process reports that it is waiting, it's swapped with the process after the last waiting process in the array.

           V-- waiting
[ A-waiting, B-active, C-active, D-active ]
                              completed --^
             ^- run

If it reports that it has completed, it's swapped with the process before the first completed array.

           V-- waiting
[ A-waiting, D-active, C-active, B-completed ]
                   completed --^
             ^- run

So as the scheduler runs and processes transition from active to waiting or completed, the array become ordered with all the waiting processes at the start, all the active ones in the middle, and the completed ones at the end.

                      V-- waiting
[ A-waiting, C-waiting, D-active, B-completed ]
                   completed --^
                        ^- run

After either a certain number of iterations, or when there are no more active processes, the completed processes are cleaned out of the array and external events are processed:

                      V-- waiting
[ A-waiting, C-waiting, D-completed, B-completed ]
          completed --^
                        ^- run == completed so stop

This is similar in that it's using swapping to remove items from a collection, but it is removing items from both ends rather and leaving the 'collection' in the middle.

like image 27
Pete Kirkham Avatar answered Nov 01 '22 01:11

Pete Kirkham