So I was looking through the implementation of SortedList<TKey, TValue>
and the implementation of Add
(which calls Insert
shown below) really surprised me.
The Add
method does the obvious binary search to determine the index in which the KVP should go, but the Insert
seems as if it could be improved significantly (albeit on larger scales of course):
private void Insert(int index, TKey key, TValue value)
{
if (this._size == this.keys.Length)
this.EnsureCapacity(this._size + 1);
if (index < this._size)
{
Array.Copy((Array) this.keys, index, (Array) this.keys, index + 1, this._size - index);
Array.Copy((Array) this.values, index, (Array) this.values, index + 1, this._size - index);
}
this.keys[index] = key;
this.values[index] = value;
++this._size;
++this.version;
}
If I'm reading this correctly, and I reserve the right to be wrong at all times, this is an O(2n)
operation.
It seems to me that the values should be implemented with pointers. Kind of like a LinkedList
in relation to the value from the key, but not linked in that it doesn't support random access. More so the key is simply linked to its value. The get operation wouldn't be any slower, and neither would the remove because we have the pointer, but the add operation would be O(n)
now instead.
Can somebody shed some light on why the decision may have gone this direction?
C# SortedList<TKey, TValue> example Let's see an example of generic SortedList<TKey, TValue> class that stores elements using Add() method and iterates elements using for-each loop. Here, we are using KeyValuePair class to get key and value.
SortedDictionary is implemented with Binary Search Tree, while SortedList is implemented with two internal arrays for keys and values, respectively. SortedList is more memory-efficient than SortedDictionary, and SortedList is faster than SortedDictionary when it needs to go through all items at once.
A SortedList represents a collection of objects stored as key-value pairs that are sorted by the keys.
This should not surprise you, it is well documented in the MSDN article for SortedList:
SortedDictionary has faster insertion and removal operations for unsorted data, O(logn) as opposed to O(n) for SortedList.
SortedDictionary uses a red-black tree (i.e. "pointers"), SortedList is an array. You choose between the two based on what you do with the collection. Both are O(logn) for lookup, but if you iterate the collection frequently then you can be ahead with SortedList a great deal. It uses the cpu caches much more effectively. Makes a huge difference on modern machines.
Also do note that the efficiency of adding items to the collections is heavily dependent on how sorted the items are. A SortedDictionary really likes random data, gives it much better odds of not having to re-balance the trees. Having it sorted gives it worst-case O(n) behavior. SortedList really likes sorted items, makes adding O(1).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With