Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient data structure for fast random access, search, insertion and deletion

I'm looking for a data structure (or structures) that would allow me keep me an ordered list of integers, no duplicates, with indexes and values in the same range.

I need four main operations to be efficient, in rough order of importance:

  1. taking the value from a given index
  2. finding the index of a given value
  3. inserting a value at a given index
  4. deleting a value at a given index

Using an array I have 1 at O(1), but 2 is O(N) and insertion and deletions are expensive (O(N) as well, I believe).

A Linked List has O(1) insertion and deletion (once you have the node), but 1 and 2 are O(N) thus negating the gains.

I tried keeping two arrays a[index]=value and b[value]=index, which turn 1 and 2 into O(1) but turn 3 and 4 into even more costly operations.

Is there a data structure better suited for this?

like image 758
Leonel Avatar asked May 20 '09 21:05

Leonel


3 Answers

I would use a red-black tree to map keys to values. This gives you O(log(n)) for 1, 3, 4. It also maintains the keys in sorted order.

For 2, I would use a hash table to map values to keys, which gives you O(1) performance. It also adds O(1) overhead for keeping the hash table updated when adding and deleting keys in the red-black tree.

like image 121
Ayman Hourieh Avatar answered Oct 29 '22 00:10

Ayman Hourieh


How about using a sorted array with binary search?

Insertion and deletion is slow. but given the fact that the data are plain integers could be optimized with calls to memcpy() if you are using C or C++. If you know the maximum size of the array, you can even avoid any memory allocations during the usage of the array, as you can preallocate it to the maximum size.

The "best" approach depends on how many items you need to store and how often you will need to insert/delete compared to finding. If you rarely insert or delete a sorted array with O(1) access to the values is certainly better, but if you insert and delete things frequently a binary tree can be better than the array. For a small enough n the array most likely beats the tree in any case.

If storage size is of concern, the array is better than the trees, too. Trees also need to allocate memory for every item they store and the overhead of the memory allocation can be significant as you only store small values (integers).

You may want to profile what is faster, the copying of the integers if you insert/delete from the sorted array or the tree with it's memory (de)allocations.

like image 20
lothar Avatar answered Oct 28 '22 23:10

lothar


I don't know what language you're using, but if it's Java you can leverage LinkedHashMap or a similar Collection. It's got all of the benefits of a List and a Map, provides constant time for most operations, and has the memory footprint of an elephant. :)

If you're not using Java, the idea of a LinkedHashMap is probably still suitable for a usable data structure for your problem.

like image 42
Rob Hruska Avatar answered Oct 29 '22 01:10

Rob Hruska