Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is collection semantics (in .NET)?

I need to maintain collection semantics in my own class. Couldn't you explain, what is the collection semantics? As I understand, it is a set of some interfaces that must be implemented in a class. Is it true? And if yes – what exactly must I implement in the class and why? Are these two interfaces – ICollection and IEnumerable – enough, or these are only the most necessary ones?

I am programming a circular linked list using this article as help.

like image 808
Mstislav Toivonen Avatar asked Dec 15 '22 19:12

Mstislav Toivonen


1 Answers

There are many collection types in .NET, and they all share some common behavior, for instance:

  • You can enumerate them using foreach
  • They have a Count property
  • You can add items using the Add method
  • etc...

This behavior is expected from a collection type, and you guessed it right: it's all in the ICollection<T> interface. Let's take a look at the interface hierarchy:

  • IEnumerable<T> allows your class to be enumerated with foreach
  • ICollection<T> is an IEnumerable<T> that represents a collection:
    • it allows to retrieve the item Count
    • you possibly can Add/Remove/Clear items from the collection
    • a collection could be read only, in that case IsReadOnly should return true
    • There's a couple other helper methods too: Contains/CopyTo.
  • IList<T> is an ICollection<T> that allows items to be accessed by index.
    • It adds an indexer
    • some index-related functions: Insert/RemoveAt
    • IndexOf

Which interface you should implement is a matter of semantics:

IEnumerable<T> is just an enumerable sequence. It should only be enumerated once by consuming code, because you never know how it may behave on multiple enumerations. Tools like ReSharper will even emit warnings if you enumerate an IEnumerable<T> multiple times.
Sure, most of the time you can safely enumerate it multiple times but there are times when you shouldn't. For instance, an enumeration could execute a SQL query (think Linq-to-SQL for instance).

You implement an IEnumerable<T> by defining one function: GetEnumerator which returns en IEnumerator<T>. An enumerator is an object that is a kind of pointer to a current element in your sequence. It can return this Current value, and it can move to the next element with MoveNext. It's also disposable (and it's disposed at the end of the enumeration by foreach).

Let's decompose a foreach loop:

IEnumerable<T> sequence = ... // Whatever
foreach (T item in sequence)
    DoSomething(item);

This is equivalent to the following:

IEnumerator<T> enumerator = null;
try
{
    enumerator = sequence.GetEnumerator();
    while (enumerator.MoveNext())
    {
        T item = enumerator.Current;
        DoSomething(item);
    }
}
finally
{
    if (enumerator != null)
        enumerator.Dispose();
}

For the record, implementing IEnumerable is not strictly required to make a class usable with a foreach. Duck typing is sufficient here but I'm digressing way too much.

And of course you can implement the pattern easily with the yield keyword:

public static IEnumerable<int> GetAnswer()
{
    yield return 42;
}

This will create a private class which will implement IEnumerable<int> for you so you don't have to.

ICollection<T> represents a collection, which is safely enumerable multiple times. But you don't really know what kind of collection it it. It could be a set, a list, a dictionary, whatever.

This is the collection semantics.

Some examples:

  • T[] - it implements ICollection<T> even if you can't Add/Remove
  • List<T>
  • HashSet<T> - a good example of a collection but not a list
  • Dictionary<TKey, TValue> - yes, that's an ICollection<KeyValuePair<TKey, TValue>>
  • LinkedList<T>
  • ObservableCollection<T>

IList<T> lets you know the collection is of the kind that lets you access elements by index easily (that is in O(1) time).

This is not the case for your circular linked list, as not only it would need O(n) time, but there's no meaningful index in the first place.

Some examples:

  • T[]
  • List<T>
  • ObservableCollection<T>

Notice that HashSet<T> and Dictionary<TKey, TValue> are no longer in the list for instance. These are not lists. LinkedList<T> is semantically a list, but it doesn't offer access by index in O(1) time (it requires O(n)).

I should mention there are read only equivalents that made in into .NET 4.5: IReadOnlyCollection<out T>, IReadOnlyList<out T>. These are nice for the covariance they provide.

like image 78
Lucas Trzesniewski Avatar answered Dec 30 '22 16:12

Lucas Trzesniewski