Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dictionary<Tkey, TValue>, List<T> and other collections implementation / runtime

I was wondering if there is any good reference (website or even better, a book) where I can find information about the internal implementation of the commonly used collections like

  • Dictionary<TKey, TValue>
  • List<T>
  • Queue<T>
  • Stack<T>
  • etc.

By internal implementation, I mean how they use a dynamic array to store their data, how often do they resize it, what is the time and space complexity for the common operations.

Of course, if anybody feels he can provide this information in this thread, you are more than welcome!

like image 688
zafeiris.m Avatar asked Sep 06 '12 10:09

zafeiris.m


2 Answers

About List<T>:

List<T> has two properties, Capacity and Count that help clarify when the resizing takes place. Capacity is the length of the internal array at any given time and Count is the number of elements added in the List. If you have an estimate of the number of elements to be added in the list, Capacity can be initialized (by choosing the appropriate constructor), which will lead to fewer or no resizings, and thus better performance.

Resizing (i.e. creation of a new larger array and copy of elements one by one to the new array) happens when Add<T>() method is called and the array is already full (Count == Capacity). The Capacity of the new array is doubled (initially, if not set by the user, it starts from 0, then 4 and then it gets doubled each time more space needed):

List<int> list = new List<int>();
//Capacity = 0, Count = 0

list.Add(52);
//Capacity = 4, Count = 1

list.Add(34);
list.Add(2);
list.Add(87);
//Capacity = 4, Count = 4

list.Add(56);
//Capacity = 8, Count = 5

For large n, the time complexity for adding a new element is amortized constant O(1). The lookup by index is constant O(1), and the insertion or removal of an element at a given index is linear O(n) because it involves shifting the rest of elements one position (to the right or left, respectively). The space used for the internal array is of course linear to the elements of the array, varying from n to 2n (or, if this makes sense: Math.Pow(2, Math.Ceiling(Math.Log(n, 2))) :).

About Queue<T> and Stack<T>:

The resizing of Queue's and Stack's internal array works similarly to what described for the List<T>.The common operations are efficient O(1) (internally, indexes are kept for Queue's head and tail elements). So, enqueuing an element in the queue or pushing in the stack needs amortized constant time, dequeuing/poping needs constant time.

About Dictionary<TKey, TValue>:

Dictionary works differently, and it is nicely described here.

like image 165
zafeiris.m Avatar answered Sep 18 '22 13:09

zafeiris.m


The exact implementation details of each of these would take a long explanation (for each one). Instead I would refer you to J. Albahari's book C# 5.0 In a Nutshell.

However, I can give you a table for memory/time consideration for common operations for the Dictonary classes. These performance times are in milliseconds, to perform 50,000 operations on a dictionary with integer keys and values, on a 1.5GHz PC.

Type                Internal         Retrieve by     Memeory         Speed Random        Speed Seq        Speed Retrieval 
                    Structure        Index?          Overhead        Insertion           Insertion        by Key
Unsorted
Dictionary<T>       Hashtable        No              22              30                  30               20
Hashtable           Hashtable        No              38              50                  50               30
ListDictonary       Linked List      No              36              50,000              50,000           50,000
OrderedDictionary   Hashtable +      Yes             59              70                  70               40
                    Array
Sorted
SortedDictionary    Red-Black        No              20              130                 100              120 
<K, V>              Tree
SortedList <K, V>   2xArray          Yes             2               3,300               30               40
SortedList          2xArray          Yes             27              4,500               100              180

Sorry I can't provide this for the others you require.

I hope this is of some use.

like image 22
MoonKnight Avatar answered Sep 20 '22 13:09

MoonKnight