Why doesn't SortedList<TKey, TValue> use pointers for the values?

So I was looking through the implementation of SortedList<TKey, TValue> and the implementation of Add (which calls Insert shown below) really surprised me.

The Add method does the obvious binary search to determine the index at which the key/value pair should go, but Insert looks as if it could be improved significantly (albeit only at larger scales, of course):

private void Insert(int index, TKey key, TValue value)
{
  // Grow both backing arrays when they are full.
  if (this._size == this.keys.Length)
    this.EnsureCapacity(this._size + 1);
  // Shift the tail of each array one slot to the right to open a gap at index.
  if (index < this._size)
  {
    Array.Copy((Array) this.keys, index, (Array) this.keys, index + 1, this._size - index);
    Array.Copy((Array) this.values, index, (Array) this.values, index + 1, this._size - index);
  }
  this.keys[index] = key;
  this.values[index] = value;
  ++this._size;
  ++this.version; // invalidates any enumerators currently in use
}

If I'm reading this correctly, and I reserve the right to be wrong at all times, this is an O(2n) operation: two array shifts of up to n elements each.

It seems to me that the values should be reached through pointers: something like a LinkedList in the way each key refers to its value, but not a linked structure in the sense that would rule out random access. Put simply, each key would just be linked to its value. The get operation wouldn't be any slower, and neither would remove, because we have the pointer, but add would now shift only one array, making it O(n) instead of O(2n).

Can somebody shed some light on why the decision may have gone this direction?

Mike Perrenoud asked Feb 12 '14 14:02



1 Answer

This should not surprise you; it is well documented in the MSDN article for SortedList:

SortedDictionary has faster insertion and removal operations for unsorted data, O(log n) as opposed to O(n) for SortedList.

SortedDictionary uses a red-black tree (i.e. "pointers"), while SortedList is backed by plain arrays. You choose between the two based on what you do with the collection. Both are O(log n) for lookup, but if you iterate the collection frequently, SortedList can come out well ahead: its contiguous storage uses the CPU caches much more effectively, which makes a big difference on modern machines.
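As a rough illustration of how interchangeable the two are from the caller's side, here is a minimal sketch. The class name `SortedCollectionsDemo`, the helper `Fill`, and the sample keys are made up for this example; the collection types and members are the standard ones from System.Collections.Generic. It fills both collections with the same unsorted keys, enumerates them, and then uses the positional access that only the array-backed SortedList offers.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; only the collection types come from the framework.
public static class SortedCollectionsDemo
{
    // Both collections implement IDictionary<TKey, TValue>,
    // so the same code can fill either one.
    public static void Fill(IDictionary<int, string> target)
    {
        foreach (int key in new[] { 42, 7, 19, 3 })
            target.Add(key, "v" + key);
    }

    public static void Main()
    {
        var list = new SortedList<int, string>();
        var dict = new SortedDictionary<int, string>();
        Fill(list);
        Fill(dict);

        // Both enumerate their keys in ascending order.
        Console.WriteLine(string.Join(",", list.Keys)); // 3,7,19,42
        Console.WriteLine(string.Join(",", dict.Keys)); // 3,7,19,42

        // Only SortedList exposes Keys and Values as indexable lists.
        Console.WriteLine(list.Keys[0]);   // 3
        Console.WriteLine(list.Values[3]); // v42
    }
}
```

The indexed access on Keys and Values is a direct consequence of the array storage; SortedDictionary has no equivalent because its entries live in tree nodes.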

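Also note that the efficiency of adding items depends heavily on how sorted the incoming items are. SortedDictionary copes well with random data: every insertion is O(log n) regardless of order, because the red-black tree keeps itself balanced, although already-sorted input tends to trigger more rebalancing. SortedList really likes sorted items: when each new key belongs at the end, nothing has to be shifted, so Add costs only the O(log n) binary search (plus the occasional array resize) instead of O(n).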
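The effect of input order on an array-backed sorted list can be made concrete with a minimal sketch. `TinySortedList` below is a hypothetical teaching class, not the framework's SortedList; it assumes int keys and string values, mimics the two-array layout from the question, and counts how many elements each Add has to shift.

```csharp
using System;

// Hypothetical teaching class that imitates the two-array layout.
public sealed class TinySortedList
{
    private int[] keys = new int[4];
    private string[] values = new string[4];
    private int size;

    // Total number of elements shifted across both arrays so far.
    public long ElementsMoved { get; private set; }

    public int Count => size;

    public void Add(int key, string value)
    {
        // Binary search over the used prefix; a negative result is the
        // bitwise complement of the index where the key belongs.
        int index = Array.BinarySearch(keys, 0, size, key);
        if (index >= 0) throw new ArgumentException("Duplicate key.");
        index = ~index;

        // Double the capacity of both arrays when they are full.
        if (size == keys.Length)
        {
            Array.Resize(ref keys, size * 2);
            Array.Resize(ref values, size * 2);
        }

        // Open a gap at index by shifting the tail of each array.
        int tail = size - index;
        if (tail > 0)
        {
            Array.Copy(keys, index, keys, index + 1, tail);
            Array.Copy(values, index, values, index + 1, tail);
            ElementsMoved += 2L * tail;
        }

        keys[index] = key;
        values[index] = value;
        size++;
    }
}
```

Adding the keys 1, 2, 3 in that order leaves ElementsMoved at 0, because every key lands at the end. Adding 3, 2, 1 shifts 0 + 2 + 4 = 6 elements, and in general n keys added in descending order shift n(n - 1) elements in total. The count ignores the copying done when the arrays grow.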

Hans Passant answered Nov 09 '22 19:11