
Why do we need two interfaces to enumerate a collection?

I have been trying for quite a while to understand the idea behind IEnumerable and IEnumerator. I have read all the questions and answers I could find on the net, and on StackOverflow in particular, but I am still not satisfied. I have reached the point where I understand how those interfaces should be used, but not why they are designed this way.

I think the essence of my confusion is that we need two interfaces for one operation. I figured that if both are needed, one alone must not be enough. So I took the "hard coded" equivalent of foreach (as I found here):

while (enumerator.MoveNext())
{
    object item = enumerator.Current;

    // logic
}

and tried to get it to work with one interface, thinking something would go wrong which would make me understand why another interface is needed.

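So I created a collection class and implemented a single interface of my own, IForeachable:

// Assumed definition: the question never shows IForeachable, so this is a guess at its shape.
interface IForeachable
{
    int Current { get; }
    bool MoveNext();
}

class Collection : IForeachable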
{
    private int[] array = { 1, 2, 3, 4, 5 };
    private int index = -1;

    public int Current => array[index];

    public bool MoveNext()
    {
        if (index < array.Length - 1)
        {
            index++;
            return true;
        }

        index = -1;
        return false;
    }
}

and used the foreach equivalent to enumerate the collection:

var collection = new Collection();

while (collection.MoveNext())
{
    object item = collection.Current;

    Console.WriteLine(item);
}

And it works! So what is missing here that makes another interface required?

Thanks.


Edit: My question is not a duplicate of the questions listed in the comments:

  • This question is about why interfaces are needed for enumerating in the first place.
  • This question and this question are about what those interfaces are and how they should be used.

My question is why they are designed the way they are, not what they are, how they work, or why we need them in the first place.

Michael Haddad asked Mar 31 '17 08:03



2 Answers

What are the two interfaces and what do they do?

The IEnumerable interface is placed on the collection object and defines the GetEnumerator() method, which returns a (normally new) object that implements the IEnumerator interface. The foreach statement in C# and the For Each statement in VB.NET use IEnumerable to obtain the enumerator and loop over the elements in the collection.

The IEnumerator interface is essentially the contract placed on the object that actually does the iteration. It stores the state of the iteration and updates it as the code moves through the collection.
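As a rough illustration of that split, here is a minimal sketch of a collection and its enumerator. The names NumberBag, BagEnumerator and NumberBagDemo are made up for this example; only IEnumerable<int>, IEnumerator<int> and their members come from .NET.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: it holds the data but no iteration state.
public sealed class NumberBag : IEnumerable<int>
{
    private readonly int[] items = { 1, 2, 3, 4, 5 };

    // Each call hands out a fresh enumerator with its own position.
    public IEnumerator<int> GetEnumerator() => new BagEnumerator(items);

    // Non-generic version required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // The enumerator owns the cursor; the collection never sees it.
    private sealed class BagEnumerator : IEnumerator<int>
    {
        private readonly int[] items;
        private int index = -1;

        public BagEnumerator(int[] items) => this.items = items;

        public int Current => items[index];
        object IEnumerator.Current => Current;

        public bool MoveNext()
        {
            if (index >= items.Length - 1) return false;
            index++;
            return true;
        }

        public void Reset() => index = -1;

        public void Dispose() { }
    }
}

public static class NumberBagDemo
{
    // foreach calls GetEnumerator() behind the scenes and drives MoveNext/Current.
    public static int Sum()
    {
        int total = 0;
        foreach (int n in new NumberBag())
        {
            total += n;
        }
        return total;
    }
}
```

Calling NumberBagDemo.Sum() walks the five items once and returns 15. The point of the layout is that the index field lives in BagEnumerator, not in NumberBag.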

Why not just have the collection be the enumerator too? Why have two separate interfaces?

There is nothing to stop IEnumerator and IEnumerable being implemented on the same class. However, there is a penalty for doing this: it won’t be possible to have two or more loops on the same collection at the same time. If it can be absolutely guaranteed that there will never be a need to loop on the collection twice at the same time, then that’s fine. But in the majority of circumstances that guarantee isn’t possible.

When would someone iterate over a collection more than once at a time?

Here are two examples.

The first example is when there are two loops nested inside each other on the same collection. If the collection were also the enumerator, it wouldn’t be possible to support nested loops on the same collection: when the code reaches the inner loop, it collides with the outer loop.
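The collision can be sketched as follows. SelfEnumeratingCollection is a hypothetical class shaped like the Collection in the question (cursor stored in the collection, reset when the end is reached), and the maxOuterPasses cap is an addition for this demo only, because without it the nested loop over the shared cursor never finishes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class modelled on the question's Collection: data and cursor in one object.
public sealed class SelfEnumeratingCollection
{
    private readonly int[] array = { 1, 2, 3, 4, 5 };
    private int index = -1;

    public int Current => array[index];

    public bool MoveNext()
    {
        if (index < array.Length - 1)
        {
            index++;
            return true;
        }

        index = -1;
        return false;
    }
}

public static class NestedLoopDemo
{
    // Nested loops over one self-enumerating object: both loops move the same cursor.
    public static (int Pairs, bool GaveUp) SharedCursor(int maxOuterPasses)
    {
        var c = new SelfEnumeratingCollection();
        var pairs = new List<(int Outer, int Inner)>();
        int outerPasses = 0;

        while (c.MoveNext())
        {
            // Safety cap (demo only): the inner loop resets the shared cursor every
            // time it finishes, so the outer loop would otherwise restart forever.
            if (++outerPasses > maxOuterPasses) return (pairs.Count, true);

            int outer = c.Current;
            while (c.MoveNext())
            {
                pairs.Add((outer, c.Current));
            }
        }

        return (pairs.Count, false);
    }

    // The same nested loop over a List<int>: each foreach gets its own enumerator.
    public static int SeparateCursors()
    {
        var list = new List<int> { 1, 2, 3, 4, 5 };
        var pairs = new List<(int Outer, int Inner)>();

        foreach (int outer in list)
        {
            foreach (int inner in list)
            {
                pairs.Add((outer, inner));
            }
        }

        return pairs.Count;
    }
}
```

With separate enumerators, SeparateCursors() visits all 25 pairs. With the shared cursor, SharedCursor(3) has to give up: the outer loop sees the first element on every pass and the inner loop only ever sees the remaining four.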

The second example is when there are two or more threads accessing the same collection. Again, if the collection were also the enumerator, it wouldn’t be possible to support safe multithreaded iteration over the same collection. When the second thread attempts to loop over the elements in the collection, the state of the two enumerations will collide.

Also, because the iteration model used in .NET does not permit alterations to a collection during enumeration, these operations are otherwise completely safe.

-- This was from a blog post I wrote many years ago: https://colinmackay.scot/2007/06/24/iteration-in-net-with-ienumerable-and-ienumerator/

Colin Mackay answered Oct 09 '22 19:10



Your IForeachable cannot even be iterated from two different threads (in fact, you cannot have multiple active iterations at all, even on the same thread), because the current enumeration state is stored in the IForeachable itself. You also have to reset the current position each time you finish enumerating, and if you forget to do that, the next caller will think your collection is empty. I can only imagine all the kinds of hard-to-track bugs this might lead to.
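To make the forgotten-reset case concrete, here is a small sketch. ForgetfulCollection is a hypothetical variant of the question's Collection in which the `index = -1` line was left out; nothing else is changed.

```csharp
using System;

// Hypothetical variant of the question's Collection whose author forgot the reset.
public sealed class ForgetfulCollection
{
    private readonly int[] array = { 1, 2, 3, 4, 5 };
    private int index = -1;

    public int Current => array[index];

    public bool MoveNext()
    {
        if (index < array.Length - 1)
        {
            index++;
            return true;
        }

        // Missing here: index = -1;
        return false;
    }
}

public static class ForgottenResetDemo
{
    // Counts how many items one full pass over the collection yields.
    public static int CountPass(ForgetfulCollection collection)
    {
        int count = 0;
        while (collection.MoveNext())
        {
            count++;
        }
        return count;
    }

    // Returns the item counts seen by two consecutive callers of the same object.
    public static (int First, int Second) TwoPasses()
    {
        var collection = new ForgetfulCollection();
        int first = CountPass(collection);   // cursor ends at the last element
        int second = CountPass(collection);  // cursor is still at the end
        return (first, second);
    }
}
```

TwoPasses() returns (5, 0): the first caller sees all five items, and the second caller, using exactly the same loop, sees none.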

On the other hand, because IEnumerable returns a new IEnumerator for each caller, you can have multiple enumerations in progress simultaneously: each caller has its own enumeration state. I think this reason alone is enough to justify two interfaces. Enumeration is essentially a read operation, and it would be very confusing if you could not read the same thing simultaneously in multiple places.
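A short sketch of that independence, using the built-in List<int> (the class name IndependentEnumeratorsDemo and the sample values are made up for illustration):

```csharp
using System;
using System.Collections.Generic;

public static class IndependentEnumeratorsDemo
{
    // Advances two enumerators over the same list by different amounts
    // and returns the element each one is currently positioned on.
    public static (int First, int Second) Run()
    {
        IEnumerable<int> numbers = new List<int> { 10, 20, 30 };

        using (IEnumerator<int> a = numbers.GetEnumerator())
        using (IEnumerator<int> b = numbers.GetEnumerator())
        {
            a.MoveNext();
            a.MoveNext(); // a is now on the second element
            b.MoveNext(); // b is still on the first element

            return (a.Current, b.Current);
        }
    }
}
```

Run() returns (20, 10): moving one enumerator has no effect on the other, because each holds its own position while the list itself holds none.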

Evk answered Oct 09 '22 20:10
