Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Where can I learn about the various types of .NET lists?

Does anyone know a good resource to concisely explain the different types of lists available in C# and when their usage is appropriate?

For example, List, Hashtable, Dictionaries etc.

I'm never quite sure when I should be using what.

like image 410
Simon Keep Avatar asked Oct 13 '08 15:10

Simon Keep


People also ask

Can we store different data types in list in C#?

We can use an ArrayList class that is present in the System. Collections namespace,. Using it we can store different types because the ArrayList class operates on the object type.

What is the difference between list and collection in C#?

Collection - can add/remove items. List - allows items to have an order (accessing and removing by index)


2 Answers

These aren't all lists, although they're all collections. Here's a quick summary.

Non-generic collections (API is in terms of object. Values types are boxed.

These are mostly in the System.Collections namespace:

  • ArrayList: A list of items, backed by an array. Fast random read/write access. Fast add to the tail end, if the buffer doesn't need resizing.
  • Hashtable: Map from key to value. Keys are unique, values don't have to be. Uses the GetHashCode method to achieve near O(1) read/write access (aside from nasty cases where all items have the same hash, or the backing store needs rebuilding). Iterating over the key/value pairs gives an unpredictable order. (Well, effectively unpredictable.)
  • SortedList: Like a Hashtable, but the entries are always returned in sorted-by-key order. Stored as a list of key/value pairs.
  • Stack: Last-in-first-out collection
  • Queue: First-in-first-out collection
  • Array: Fixed-size O(1) random-access; non-generic, but has strongly typed forms as well

Generic collections. (Strongly-typed API, will not box value types (assuming suitable T).

These are mostly in the System.Collections.Generic namespace:

  • List<T>: Like ArrayList
  • Dictionary<TKey, TValue>: like Hashtable
  • SortedList<TKey, TValue>: like SortedList
  • SortedDictionary<TKey, TValue>: like SortedList, but stored as a tree of key/value pairs which gives better performance in many situations. See docs for more detail.
  • LinkedList<T>: Doubly linked list (fast access to head and tail)
  • Stack<T>: Like Stack
  • Queue<T>: Like Queue
  • ReadOnlyCollection<T>: Like List<T> but giving a read-only view

Possibly the most important collection interface is IEnumerable (and IEnumerable<T>). This represents a sequence of items much like a Stream represents a sequence of bytes. There is no random access, just forward-reading. LINQ to Objects is based on this, and pretty much all collection types implement it.

like image 124
Jon Skeet Avatar answered Sep 16 '22 22:09

Jon Skeet


To expound on tobsen's earlier answer, the C5 Generic Collection Library has a large number of, well, collections. I'll describe some of them here:

Queue/Stack

  • CircularQueue<T>: This class provides strictly Queue and Stack functionality. As well, efficient O(1) access to any item in the Stack/Queue is available using the indexer: cq[0] (where 0 is the oldest item, next to be dequeued, last to be popped).

Lists

Note: ArrayList and LinkedList can also function as Queue/Stacks

  • ArrayList<T>: Similar to its counterpart in System.Collections.Generic (SCG), List<T>, this is backed by an array, guaranteeing O(1) indexing, but worst-case O(n) insertion.O(n) to find an item.
  • LinkedList<T>: Similar to its counterpart SCG.LinkedList<T>. Using a doubly-linked list, guarantees O(1) insertion, but worst-case O(n) indexing (in practice, is proportional to distance from either head or tail of the list). Also O(n) to find an item. Sorting uses a stable Merge Sort.
  • HashedArrayList<T>: Similar to the ArrayList<T> above, but does not allow duplicates. The benefit you get in return is that the time to find an item and its index is reduced to O(1).
  • HashedLinkedList<T>: Similar to the LinkedList<T> above, but does not allow duplicates. As before, the time to find an item is reduced to O(1), but time to find its index remains O(n).
  • WrappedArray<T>: Fairly similar to the ArrayList<T>, this acts as a wrapper around an array that implements C5.IList<T>, but throws exceptions if an attempt is made to modify the collection (IsFixedSize is true and Add, Remove, Insert don't work; Sort, Shuffle, and Reverse do, however, as they are in-place operations).

Lists also provide a "View" functionality which represents a segment of the underlying list, allowing local operations to be performed. Using patterns offered in the C5 book, operations can be performed using views that are efficient on both array and linked lists. Any list operation can also be performed on a view, restricting their effect to a subset of the underlying list.

Sorted Collections

  • SortedArray<T>: Similar to an ArrayList<T> except that it keeps its items sorted and does not allow duplicates. Note that random insertions and deletions on this collection are slow. This collection is best if the number of items is small or rarely modified but often accessed by item index or value.
  • TreeSet<T>: Uses a red-black tree structure to keep items sorted. As a set, it does not allow duplicates. Access by index or item value and insertion/deletion take O(log n).
  • TreeBag<T>: Uses a red-black tree, keeping items sorted. As a bag, it does allow duplicates, but does not store duplicates in the tree, rather keeping duplicates by counting.

Both TreeSet<T> and TreeBag<T> provide the ability to efficiently make "snapshots" or persistent copies of the tree in O(1), allowing iteration over the snapshot while modifying the underlying tree. Note that each snapshot on a tree adds a performance penalty to updates to the tree, but these effects go away when the snapshot is disposed.

Hash Collections

  • HashSet<T>: A collection using a simple hash table for storage. Access by item value takes O(1). As a set, it does not allow duplicates. Provides a function BucketCostDistribution() that can help tell you determine the efficiency of the items' hashcode function.
  • HashBag<T>: Similar to the HashSet<T>, but has bag semantics, meaning that duplicates are allowed, but duplicates are only stored by counting.

Priority Queue

  • IntervalHeap<T>: Provides a priority queue. Finding the maximum and minimum are O(1) operations, deleting the maximum, minimum, adding, and updating are O(log n) operations. Allows duplicates by storing them explicitly (not by counting).

Dictionaries

  • HashDictionary<H,K>: Similar to the SCG.Dictionary<H,K>, provides entry access, insertion, and deletion in O(1). Also provides a BucketCostDistribution() function as HashSet<T> above. Does not guarantee any particular enumeration order.
  • TreeDictionary<H,K>: Similar to the SCG.SortedDictionary<H,K>, provides a persistently sorted dictionary using a red-black tree. Entry access, insertion, and deletion take O(log n). Guarantees that enumeration of the dictionary follows the order specified by the key comparer.

Guarded Collections

As well, C5 also offers "guarded" collections, which effectively acts as a read-only wrapper, preventing the collection from being modified. Items in the collection still may be modified, but items can't be added, deleted, or inserted into the collection.

A long answer, but thorough on the C5 libraries various collections at your disposal. I have found the C5 library to be great and often use it in my own code, replacing the common C# header with:

using C5; using SCG = System.Collections.Generic; 
like image 44
Marcus Griep Avatar answered Sep 17 '22 22:09

Marcus Griep