Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Enumerator.MoveNext() throws 'Collection was Modified' on first call

Consider the following code:

List<int> list = new List<int>();
IEnumerable<int> enumerable = list;
IEnumerator<int> enumerator = enumerable.GetEnumerator();
list.Add(1);
bool any = enumerator.MoveNext();

At runtime, the last line throws an:

InvalidOperationException: Collection was modified; enumeration operation may not execute.

I understand the need for IEnumerators to throw 'Collection was modified' exceptions when the IEnumerable changes, but I don't understand this:

Why does the IEnumerator throw this exception on the first call of MoveNext()? Since the IEnumerator doesn't represent the state of the IEnumerable until MoveNext() is called for the first time, why can't it start tracking changes from the first MoveNext() instead of from GetEnumerator()?

like image 619
Nathan Avatar asked Dec 09 '22 23:12

Nathan


2 Answers

Probably because the rule "an Enumerator is invalidated if the underlying collection is modified" is simpler than the rule "an Enumerator is invalidated if the underlying collection is modified after the first call to MoveNext". Or it's just the way it's implemented. Plus, it's just reasonable to assume that an Enumerator represents the state of the underlying collection at the moment the Enumerator was created, and relying on a different behavior is likely to be a source of bugs.

like image 144
Thom Smith Avatar answered Dec 11 '22 11:12

Thom Smith


I feel like a quick recap of iterators is needed.

An iterator (IEnumerator and IEnumerable for C#) is used to access elements of a structure in an ordered manner without exposing the underlying representation. The consequence is that it allows you to have extemely generic functions such as the following.

void Iterator<T, V>(T collection, Action<V> actor) where T : IEnumerable<V>
{
    foreach (V value in collection)
        actor(value);
}

//Or the more verbose way
void Iterator<T, V>(T collection, Action<V> actor) where T : IEnumerable<V>
{
    using (var iterator = collection.GetEnumerator())
    {
        while (iterator.MoveNext())
            actor(iterator.Current);
    }
}

//Or if you need to support non-generic collections (ArrayList, Queue, BitArray, etc)
void Iterator<T, V> (T collection, Action<V> actor) where T : IEnumerable
{
    foreach (object value in collection)
        actor((V)value);
}

There are trade-offs, as can be seen in the C# specification.

5.3.3.16 Foreach statements

foreach ( type identifier in expr ) embedded-statement

  • The definite assignment state of v at the beginning of expr is the same as the state of v at the beginning of stmt.

  • The definite assignment state of v on the control flow transfer to embedded-statement or to the end point of stmt is the same as the state of v at the end of expr.

Which simply means that values are read-only. Why are they read-only? It's simple. Since foreach is such a high level statement, it can't and won't assume anything about the container you are iterating over. What if you were iterating over a binary tree and decided to randomly assign values inside the foreach statement. If foreach didn't force read-only access then your binary tree would degenerate into a tree. The entire data structure would be in disarray.

But this wasn't your original question. You were modifying the collection before you even accessed the first element and an error was thrown. Why? For this, I dug into the List class using ILSpy. Here's a snippet of the List class

public class List<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable
{
    private int _version;

    public struct Enumerator : IEnumerator<T>, IDisposable, IEnumerator
    {
        private List<T> list;
        private int version;
        private int index;

        internal Enumerator(List<T> list)
        {
            this.list = list;
            this.version = list._version;
            this.index = 0;
        }

        /* All the implemented functions of IEnumerator<T> and IEnumerator will throw 
           a ThrowInvalidOperationException if (this.version != this.list._version) */
    }
}

The enumerator is initialized with the parent list's "version" and a reference to the parent list. All iterating operations check to make sure that the initial version is equivalent to the referenced list's current version. If they are out of synch, the iterator is no longer valid. Why does the BCL do this? Why didn't the implementers check if the index of the enumerator was 0 (representing a new enumerator), and if it was, simply resynch the versions? I'm not sure. I can only hypothesize that the team wanted conformity amongst all the classes that implemented IEnumerable and they also wanted to keep it simple. Therefore a List's enumerator (and I believe most others) do not discriminate between elements as long as they are within range.

This is the root cause of your problem. If you absolutely must have this functionality, then you will have to implement your own iterator and you may end up having to implement your own List. In my opinion, way too much work to against the flow of the BCL.

Here's a quote from the GoF when designing an iterator that the BCL team probably followed:

It can be dangerous to modify an aggregate while you're traversing it. If elements are added or deleted from the aggregate, you might end up accessing an element twice or missing it completely. A simple solution is to copy the aggregate and traverse the copy, but that's too expensive to do in general

The BCL team most likely decided that it was too expensive in space-time complexity and manpower. And this philosophy is seen throughout C#. It is probably too expensive to allow modification of variables inside a foreach, too expensive to have List's Enumerator discriminate where it's in the list, and too expensive to cradle the user. Hopefully I've explained it well enough that one can seen the power and the constraint of iterators.

Reference:

What changes the "version" of a list and thus invalidates all current enumerators?

  • Changing an element through the indexer
  • Add
  • AddRange
  • Clear
  • Insert
  • InsertRange
  • RemoveAll
  • RemoveAt
  • RemoveRange
  • Reverse
  • Sort
like image 32
Nick Babcock Avatar answered Dec 11 '22 12:12

Nick Babcock