Reading an IEnumerable multiple times

Let's say I have some code:

var items = ItemsGetter.GetAllItems().Where(x => x.SomeProperty > 20);
int sum1 = items.Sum(x => x.SomeFlag ? 1 : 0);

Later in the code I need another sum from the same items collection, for example:

int sum2 = items.Sum(x => x.OtherFlag ? 0 : 1);

So my question: is it OK to call LINQ methods on an IEnumerable more than once? Or should I call the Reset() method on the enumerator, or make a list from items using the ToList method?

Aleksandr Ivanov asked Sep 30 '11 15:09


3 Answers

Well, it really depends on what you want to do. You could take the hit of executing the query twice (and the exact meaning of that will depend on what GetAllItems() does), or you could take the hit of copying the results to a list:

var items = ItemsGetter.GetAllItems().Where(x => x.SomeProperty > 20).ToList();

Once it's in a list, obviously it's not a problem to iterate over that list multiple times.
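To make the trade-off concrete, here is a minimal sketch. The `GetAllItems` iterator below is a hypothetical stand-in for the asker's `ItemsGetter.GetAllItems()`, and the `enumerations` counter exists only for illustration. The real method may do something far more expensive per enumeration, such as a database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    private static int enumerations;

    // Hypothetical stand-in for ItemsGetter.GetAllItems().
    private static IEnumerable<int> GetAllItems()
    {
        enumerations++; // runs when an enumeration starts, not when the method is called
        for (int i = 0; i < 100; i++)
            yield return i;
    }

    // Returns how many times the source was enumerated while computing two counts.
    public static int CountEnumerations(bool buffer)
    {
        enumerations = 0;
        IEnumerable<int> items = GetAllItems().Where(x => x > 20);
        if (buffer)
            items = items.ToList(); // enumerates the source once, right here

        int evens = items.Count(x => x % 2 == 0);
        int odds = items.Count(x => x % 2 != 0);
        return enumerations;
    }

    public static void Main()
    {
        Console.WriteLine(CountEnumerations(buffer: false)); // 2
        Console.WriteLine(CountEnumerations(buffer: true));  // 1
    }
}
```

Without the list, each `Count` call re-runs the whole query. With `ToList`, the source is read once and the results are held in memory.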

Note that you can't call Reset because you don't have the iterator - you have the IEnumerable<T>. I wouldn't recommend calling Reset on an IEnumerator<T> in general anyway - many implementations (including any generated by the C# compiler from iterator blocks) don't actually implement it (i.e. they throw an exception).
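As a small illustration of the Reset point, the sketch below calls Reset on an enumerator produced by an iterator block. `Numbers` and `ResetIsSupported` are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ResetDemo
{
    // An iterator block: the compiler generates the enumerator class.
    private static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    public static bool ResetIsSupported()
    {
        using (IEnumerator<int> e = Numbers().GetEnumerator())
        {
            e.MoveNext();
            try
            {
                e.Reset(); // compiler-generated iterators throw here
                return true;
            }
            catch (NotSupportedException)
            {
                return false;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(ResetIsSupported()); // False
    }
}
```

To start over, ask the IEnumerable<T> for a fresh enumerator (which is what a second `foreach` or LINQ call does) rather than resetting an old one.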

Jon Skeet answered Oct 20 '22 00:10


I'm occasionally in the situation where I have to process an enumerable multiple times. If enumerating is expensive, non-repeatable and yields a lot of data (like an IQueryable that reads from a database), enumerating multiple times is not an option, and neither is buffering the result in memory.

Until today I often ended up writing aggregator classes into which I could push items in a foreach loop and eventually read the results out - much less elegant than LINQ is.
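For comparison, a hand-written aggregator of the kind described above might look roughly like this. It is only a sketch: `MinMaxAggregator`, `Push` and `Scan` are hypothetical names, not code from the answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical aggregator: items are pushed in one at a time, results are read at the end.
public sealed class MinMaxAggregator
{
    public int Min { get; private set; } = int.MaxValue;
    public int Max { get; private set; } = int.MinValue;
    public bool HasValue { get; private set; }

    public void Push(int value)
    {
        if (value < Min) Min = value;
        if (value > Max) Max = value;
        HasValue = true;
    }
}

public static class AggregatorDemo
{
    // Single pass over the source, no buffering.
    public static MinMaxAggregator Scan(IEnumerable<int> source)
    {
        var aggregator = new MinMaxAggregator();
        foreach (int item in source)
            aggregator.Push(item);
        return aggregator;
    }

    public static void Main()
    {
        MinMaxAggregator result = Scan(new[] { 3, -1, 7 });
        Console.WriteLine($"{result.Min} {result.Max}"); // -1 7
    }
}
```

It works in one pass, but every new kind of result needs another hand-written class, which is the inelegance being complained about.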

But wait, did I just say "push"? Doesn't that sound like... reactive? So I was thinking during tonight's walk. Back home I tried it - and it works!

The example snippet shows how to get both the minimum and maximum items from a sequence of integers in a single pass, using standard LINQ operators (those of Rx, that is):

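using System.Collections.Generic;
using System.Reactive.Concurrency; // Scheduler
using System.Reactive.Linq;        // ToObservable, Publish, Min, Max, CombineLatest, GetAwaiter

// MinMax is not shown here; assume a simple class with settable int Min and Max properties.
public static MinMax GetMinMax(IEnumerable<int> source)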
{
    // convert source to an observable that does not enumerate (yet) when subscribed to
    var connectable = source.ToObservable(Scheduler.Immediate).Publish();

    // set up multiple consumers
    var minimum = connectable.Min();
    var maximum = connectable.Max();

    // combine into final result
    var final = minimum.CombineLatest(maximum, (min, max) => new MinMax { Min = min, Max = max });

    // make final subscribe to consumers, which in turn subscribe to the connectable observable
    var resultAsync = final.GetAwaiter();

    // now that everybody is listening, enumerate!
    connectable.Connect();

    // result available now
    return resultAsync.GetResult();
}

tinudu answered Oct 19 '22 23:10


LINQ uses deferred execution, so 'items' is not enumerated until a method such as Sum consumes it. Each of your Sum calls re-runs the query and takes O(n) to iterate through it. Depending on how large your items sequence is (and how expensive GetAllItems() is), you may not want to iterate over it multiple times.
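If a second pass is too costly and buffering is unwanted, both values can be computed in one enumeration. The sketch below uses `Enumerable.Aggregate` with a tuple accumulator. The `Item` class is a hypothetical shape inferred from the property names in the question, and `CountBoth` is a made-up helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only the member names come from the question.
public sealed class Item
{
    public int SomeProperty { get; set; }
    public bool SomeFlag { get; set; }
    public bool OtherFlag { get; set; }
}

public static class SinglePassDemo
{
    // Computes both counts in a single enumeration of the deferred query.
    public static (int Flagged, int NotOther) CountBoth(IEnumerable<Item> allItems)
    {
        return allItems
            .Where(x => x.SomeProperty > 20)
            .Aggregate(
                (Flagged: 0, NotOther: 0),
                (acc, x) => (acc.Flagged + (x.SomeFlag ? 1 : 0),
                             acc.NotOther + (x.OtherFlag ? 0 : 1)));
    }

    public static void Main()
    {
        var data = new List<Item>
        {
            new Item { SomeProperty = 30, SomeFlag = true,  OtherFlag = false },
            new Item { SomeProperty = 25, SomeFlag = false, OtherFlag = true  },
            new Item { SomeProperty = 10, SomeFlag = true,  OtherFlag = false }, // filtered out
        };

        var (flagged, notOther) = CountBoth(data);
        Console.WriteLine($"{flagged} {notOther}"); // 1 1
    }
}
```

This is still O(n), but the source is read only once, which matters when each enumeration is expensive.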

Stealth Rabbi answered Oct 20 '22 00:10