Does anyone know a good resource to concisely explain the different types of lists available in C# and when their usage is appropriate?
For example, List, Hashtable, Dictionaries etc.
I'm never quite sure when I should be using what.
We can use an ArrayList class that is present in the System. Collections namespace,. Using it we can store different types because the ArrayList class operates on the object type.
Collection - can add/remove items. List - allows items to have an order (accessing and removing by index)
These aren't all lists, although they're all collections. Here's a quick summary.
Non-generic collections (API is in terms of object
. Values types are boxed.
These are mostly in the System.Collections namespace:
Generic collections. (Strongly-typed API, will not box value types (assuming suitable T).
These are mostly in the System.Collections.Generic namespace:
Possibly the most important collection interface is IEnumerable (and IEnumerable<T>). This represents a sequence of items much like a Stream represents a sequence of bytes. There is no random access, just forward-reading. LINQ to Objects is based on this, and pretty much all collection types implement it.
To expound on tobsen's earlier answer, the C5 Generic Collection Library has a large number of, well, collections. I'll describe some of them here:
Queue/Stack
CircularQueue<T>
: This class provides strictly Queue and Stack functionality. As well, efficient O(1) access to any item in the Stack/Queue is available using the indexer: cq[0]
(where 0 is the oldest item, next to be dequeued, last to be popped).Lists
Note: ArrayList
and LinkedList
can also function as Queue/Stacks
ArrayList<T>
: Similar to its counterpart in System.Collections.Generic (SCG)
, List<T>
, this is backed by an array, guaranteeing O(1) indexing, but worst-case O(n) insertion.O(n) to find an item.LinkedList<T>
: Similar to its counterpart SCG.LinkedList<T>
. Using a doubly-linked list, guarantees O(1) insertion, but worst-case O(n) indexing (in practice, is proportional to distance from either head or tail of the list). Also O(n) to find an item. Sorting uses a stable Merge Sort.HashedArrayList<T>
: Similar to the ArrayList<T>
above, but does not allow duplicates. The benefit you get in return is that the time to find an item and its index is reduced to O(1).HashedLinkedList<T>
: Similar to the LinkedList<T>
above, but does not allow duplicates. As before, the time to find an item is reduced to O(1), but time to find its index remains O(n).WrappedArray<T>
: Fairly similar to the ArrayList<T>
, this acts as a wrapper around an array that implements C5.IList<T>
, but throws exceptions if an attempt is made to modify the collection (IsFixedSize
is true and Add
, Remove
, Insert
don't work; Sort
, Shuffle
, and Reverse
do, however, as they are in-place operations).Lists also provide a "View" functionality which represents a segment of the underlying list, allowing local operations to be performed. Using patterns offered in the C5 book, operations can be performed using views that are efficient on both array and linked lists. Any list operation can also be performed on a view, restricting their effect to a subset of the underlying list.
Sorted Collections
SortedArray<T>
: Similar to an ArrayList<T>
except that it keeps its items sorted and does not allow duplicates. Note that random insertions and deletions on this collection are slow. This collection is best if the number of items is small or rarely modified but often accessed by item index or value.TreeSet<T>
: Uses a red-black tree structure to keep items sorted. As a set, it does not allow duplicates. Access by index or item value and insertion/deletion take O(log n).TreeBag<T>
: Uses a red-black tree, keeping items sorted. As a bag, it does allow duplicates, but does not store duplicates in the tree, rather keeping duplicates by counting.Both TreeSet<T>
and TreeBag<T>
provide the ability to efficiently make "snapshots" or persistent copies of the tree in O(1), allowing iteration over the snapshot while modifying the underlying tree. Note that each snapshot on a tree adds a performance penalty to updates to the tree, but these effects go away when the snapshot is disposed.
Hash Collections
HashSet<T>
: A collection using a simple hash table for storage. Access by item value takes O(1). As a set, it does not allow duplicates. Provides a function BucketCostDistribution()
that can help tell you determine the efficiency of the items' hashcode function.HashBag<T>
: Similar to the HashSet<T>
, but has bag semantics, meaning that duplicates are allowed, but duplicates are only stored by counting.Priority Queue
IntervalHeap<T>
: Provides a priority queue. Finding the maximum and minimum are O(1) operations, deleting the maximum, minimum, adding, and updating are O(log n) operations. Allows duplicates by storing them explicitly (not by counting).Dictionaries
HashDictionary<H,K>
: Similar to the SCG.Dictionary<H,K>
, provides entry access, insertion, and deletion in O(1). Also provides a BucketCostDistribution()
function as HashSet<T>
above. Does not guarantee any particular enumeration order.TreeDictionary<H,K>
: Similar to the SCG.SortedDictionary<H,K>
, provides a persistently sorted dictionary using a red-black tree. Entry access, insertion, and deletion take O(log n). Guarantees that enumeration of the dictionary follows the order specified by the key comparer.Guarded Collections
As well, C5 also offers "guarded" collections, which effectively acts as a read-only wrapper, preventing the collection from being modified. Items in the collection still may be modified, but items can't be added, deleted, or inserted into the collection.
A long answer, but thorough on the C5 libraries various collections at your disposal. I have found the C5 library to be great and often use it in my own code, replacing the common C# header with:
using C5; using SCG = System.Collections.Generic;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With