
Why does Enumerable.Range get iterated twice?

Tags: c#, .net, linq

I am aware that IEnumerable is lazy, but I don't understand why Enumerable.Range gets iterated twice here:

using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication
{
    class Program
    {
        static int GetOne(int i)
        {
            return i / 2;
        }

        static IEnumerable<int> GetAll(int n)
        {
            var items = Enumerable.Range(0, n).Select((i) =>
            {
                Console.WriteLine("getting item: " + i);
                return GetOne(i);
            });

            var data = items.Select(item => item * 2);

            // data.Count() does NOT cause another re-iteration
            Console.WriteLine("first: items: " + data.Count());
            return data;
        }

        static void Main()
        {
            var data = GetAll(3);

            // data.Count() DOES cause another re-iteration
            Console.WriteLine("second: items: " + data.Count());
            Console.ReadLine();
        }
    }
}

Result:

getting item: 0
getting item: 1
getting item: 2
first: items: 3
getting item: 0
getting item: 1
getting item: 2
second: items: 3

Why does it not get re-iterated in the "first" case, but does in the "second"?

asked Feb 13 '23 19:02 by avo


1 Answer

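You are triggering an iteration on every call to Count(), which has to walk the whole source in order to supply the answer. The first three "getting item" lines come from the Count() inside GetAll (that is the first iteration, not a free one), and the next three come from the Count() in Main. A deferred query like this never keeps hold of its values and re-iterates every time it is enumerated.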

On top of something like an array or List<T> this isn't such a problem, but when the sequence is a query, a complex yield return structure, or some other code that produces values on demand (such as a Select over Enumerable.Range), it can get expensive.

This is why ReSharper does things like warn you of multiple enumeration.

If you need to remember the result of Count(), store it in a variable. If you want to guard against enumerating an expensive source more than once, materialize it first, for example var myCachedValues = myEnumerable.ToArray(), and then iterate the array instead (thereby guaranteeing only one iteration of the source).
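As a rough sketch of the difference ToArray() makes, the snippet below counts how many items a lazy source produces across two Count() calls. The class name MultipleEnumerationDemo, the method ItemsProduced, and the local Source() iterator are made up for illustration and do not appear in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical helper: reports how many items the lazy source produced
    // across two Count() calls, with or without caching via ToArray().
    public static int ItemsProduced(bool cacheFirst)
    {
        int produced = 0;

        // Local iterator standing in for an expensive lazy source.
        IEnumerable<int> Source()
        {
            for (int i = 0; i < 3; i++)
            {
                produced++;          // plays the role of "getting item: i"
                yield return i;
            }
        }

        // ToArray() enumerates the source once and stores the results.
        IEnumerable<int> data = cacheFirst ? Source().ToArray() : Source();

        data.Count();   // on the lazy source this enumerates everything
        data.Count();   // and so does this; on the array it reads the length

        return produced;
    }
}
```

Under these assumptions ItemsProduced(false) returns 6 (two full passes over three items) and ItemsProduced(true) returns 3 (a single pass performed by ToArray()).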

If you want to go down the silly route (like I did), you could implement an enumerator that internally caches items in a list, so you keep the benefits of deferred execution and also get the benefits of caching once the sequence has been iterated at least once. I called it IRepeatable. I was largely berated by my colleagues for this, but I'm stubborn.
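The answer does not show its IRepeatable code, so the following is only a minimal sketch of the general idea, under a hypothetical name (CachingEnumerable<T>). It assumes single-threaded use: the source is pulled lazily, each item is stored in a list as it is produced, and later enumerations replay the list instead of re-running the source.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical wrapper; not the answer author's actual IRepeatable.
// Not thread-safe: intended for single-threaded use only.
public sealed class CachingEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    private readonly List<T> _cache = new List<T>();
    private IEnumerator<T> _enumerator;   // created on first use
    private bool _exhausted;

    public CachingEnumerable(IEnumerable<T> source)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
    }

    public IEnumerator<T> GetEnumerator()
    {
        int index = 0;
        while (true)
        {
            // Replay anything already cached.
            if (index < _cache.Count)
            {
                yield return _cache[index++];
                continue;
            }

            if (_exhausted) yield break;

            // Pull one more item from the underlying source and cache it.
            if (_enumerator == null) _enumerator = _source.GetEnumerator();
            if (_enumerator.MoveNext())
            {
                _cache.Add(_enumerator.Current);
            }
            else
            {
                _exhausted = true;
                _enumerator.Dispose();
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Wrapping a lazy sequence as new CachingEnumerable<int>(someQuery) and calling Count() on the wrapper twice should run someQuery only once. The trade-off is that the cache holds every item in memory for as long as the wrapper lives, which is part of why ToArray() is usually the simpler choice.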

answered Feb 23 '23 04:02 by Adam Houldsworth