
"Unzip" IEnumerable dynamically in C# or best alternative

Let's assume you have a function that returns a lazily-enumerated object:

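struct AnimalCount
{
    public int Chickens;
    public int Goats;

    // Constructor matching the new AnimalCount(x, y) call used below.
    public AnimalCount(int chickens, int goats) { Chickens = chickens; Goats = goats; }
}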

IEnumerable<AnimalCount> FarmsInEachPen()
{
    ....
    yield return new AnimalCount(x, y);
    ....
}

You also have two functions that consume two separate IEnumerables, for example:

ConsumeChicken(IEnumerable<int>);
ConsumeGoat(IEnumerable<int>);

How can you call ConsumeChicken and ConsumeGoat without a) calling ToList() on FarmsInEachPen() beforehand, because it might have two zillion records, and b) multithreading?

Basically:

ConsumeChicken(FarmsInEachPen().Select(x => x.Chickens));
ConsumeGoat(FarmsInEachPen().Select(x => x.Goats));

But without forcing the double enumeration.

I can solve it with multithreading, but that gets unnecessarily complicated, with a buffer queue for each list.

So I'm looking for a way to split the AnimalCount enumerator into two int enumerators without fully evaluating AnimalCount. There is no problem running ConsumeGoat and ConsumeChicken together in lock-step.

I can feel the solution just out of my grasp but I'm not quite there. I'm thinking along the lines of a helper function that returns an IEnumerable to be fed into ConsumeChicken; each time the iterator is advanced, it internally calls ConsumeGoat, thus executing the two functions in lock-step. Except, of course, I don't want to call ConsumeGoat more than once.

Mahmoud Al-Qudsi asked Mar 28 '13 19:03


2 Answers

I don't think there is a way to do what you want, since ConsumeChicken(IEnumerable<int>) and ConsumeGoat(IEnumerable<int>) are called sequentially, and each of them enumerates its sequence separately. How do you expect that to work without two separate enumerations of the source?

Depending on the situation, a better solution is to have ConsumeChicken(int) and ConsumeGoat(int) methods (which each consume a single item), and call them in alternation. Like this:

foreach(var animal in animals)
{
    ConsumeChicken(animal.Chickens);
    ConsumeGoat(animal.Goats);
}

This will enumerate the animals collection only once.
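
A minimal, self-contained sketch of that refactoring, assuming each consumer only needs to see one value at a time. ChickenConsumer, GoatConsumer, their Total properties, SinglePassDemo, and the sample data are hypothetical stand-ins, not part of the question:

```csharp
using System;
using System.Collections.Generic;

struct AnimalCount
{
    public int Chickens;
    public int Goats;

    public AnimalCount(int chickens, int goats)
    {
        Chickens = chickens;
        Goats = goats;
    }
}

// Hypothetical per-item consumers: each keeps its own state between calls.
class ChickenConsumer
{
    public int Total { get; private set; }
    public void Consume(int chickens) { Total += chickens; }
}

class GoatConsumer
{
    public int Total { get; private set; }
    public void Consume(int goats) { Total += goats; }
}

static class SinglePassDemo
{
    // Stand-in for the lazily evaluated source (the sample data is made up).
    public static IEnumerable<AnimalCount> FarmsInEachPen()
    {
        yield return new AnimalCount(3, 1);
        yield return new AnimalCount(5, 2);
        yield return new AnimalCount(0, 4);
    }

    // Feeds both consumers in lock-step during a single enumeration.
    public static void Run(IEnumerable<AnimalCount> animals,
                           ChickenConsumer chickens, GoatConsumer goats)
    {
        foreach (var animal in animals)
        {
            chickens.Consume(animal.Chickens);
            goats.Consume(animal.Goats);
        }
    }
}
```

Calling SinglePassDemo.Run(SinglePassDemo.FarmsInEachPen(), chickens, goats) with fresh consumers leaves chickens.Total at 8 and goats.Total at 7. Whatever state the original IEnumerable<int> consumers kept in local variables has to move into fields like Total.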


Also, a note: depending on your LINQ provider and what exactly you're trying to do, there may be better options. For example, if you're trying to get the total sum of both chickens and goats from a database using LINQ to SQL or LINQ to Entities, the following query

from a in animals
group a by 0 into g
select new 
{
    TotalChickens = g.Sum(x => x.Chickens), 
    TotalGoats = g.Sum(x => x.Goats)
}

will result in a single query and do the summation on the database end, which is greatly preferable to pulling the entire table over and doing the summation on the client end.

BlueRaja - Danny Pflughoeft answered Oct 03 '22 15:10


The way you have posed your problem, there is no way to do this. IEnumerable<T> is a pull enumerable: you call GetEnumerator to get a cursor at the front of the sequence and then repeatedly ask "Give me the next item" (MoveNext/Current). You can't, on one thread, have two different things pulling from animals.Select(a => a.Chickens) and animals.Select(a => a.Goats) at the same time. You would have to do one and then the other, which would require either enumerating the source a second time or materializing the second sequence during the first pass.
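
To make the pull model concrete, here is a small sketch. PullDemo, SourceRuns, SumByPulling, and the sample data are hypothetical names invented for illustration, and tuples stand in for AnimalCount to keep it short. It pulls items by hand with MoveNext/Current and counts how often the lazy source is restarted:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PullDemo
{
    // Counts how many times the iterator body starts running.
    public static int SourceRuns;

    // Stand-in lazy source (the sample data is made up).
    public static IEnumerable<(int Chickens, int Goats)> Animals()
    {
        SourceRuns++;
        yield return (3, 1);
        yield return (5, 2);
    }

    // Pulls items by hand, which is roughly what foreach does for you.
    public static int SumByPulling(IEnumerable<int> values)
    {
        int sum = 0;
        using (IEnumerator<int> e = values.GetEnumerator())
        {
            while (e.MoveNext())   // "give me the next item"
            {
                sum += e.Current;
            }
        }
        return sum;
    }

    // Two sequential consumers over the same lazy source.
    public static (int Chickens, int Goats) ConsumeBoth()
    {
        SourceRuns = 0;
        var animals = Animals();
        int chickens = SumByPulling(animals.Select(a => a.Chickens)); // first pass
        int goats = SumByPulling(animals.Select(a => a.Goats));       // second pass
        return (chickens, goats);
    }
}
```

After PullDemo.ConsumeBoth() returns, PullDemo.SourceRuns is 2: each consumer obtained its own enumerator, so the source iterator ran from the start twice.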

The suggestion BlueRaja made is one way to change the problem slightly. I would suggest going that route.

The other alternative is to use IObservable<T> from Microsoft's Reactive Extensions (Rx), which is a push enumerable. I won't go into the details of how you would do that, but it's something you could look into.
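
As a rough illustration of the push idea only, the sketch below uses the IObserver<T> interface from the base class library and a hand-written loop. It does not use the Rx library itself, where you would normally go through IObservable<T> and Subscribe. SummingObserver and PushDemo.Push are hypothetical names made up for this example:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical push-style consumer: values are handed to it one at a time.
class SummingObserver : IObserver<int>
{
    public int Total { get; private set; }
    public bool Completed { get; private set; }

    public void OnNext(int value) { Total += value; }
    public void OnError(Exception error) { /* a real consumer would react here */ }
    public void OnCompleted() { Completed = true; }
}

static class PushDemo
{
    // Enumerates the source once and pushes each projection to its observer.
    // Exceptions thrown by the source are not forwarded to OnError in this sketch.
    public static void Push<T>(
        IEnumerable<T> source,
        Func<T, int> first, IObserver<int> firstObserver,
        Func<T, int> second, IObserver<int> secondObserver)
    {
        foreach (var item in source)
        {
            firstObserver.OnNext(first(item));
            secondObserver.OnNext(second(item));
        }
        firstObserver.OnCompleted();
        secondObserver.OnCompleted();
    }
}
```

For example, PushDemo.Push(new[] { (3, 1), (5, 2) }, a => a.Item1, chickens, a => a.Item2, goats) with two fresh SummingObserver instances leaves chickens.Total at 8 and goats.Total at 3 after a single pass. The source drives the consumers here, the reverse of the pull model.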

Edit:

The above assumes that ConsumeChicken and ConsumeGoat both return void, or at least do not return IEnumerable<T> themselves, which seems like an obvious assumption. I'd appreciate it if the lame downvoter would actually comment.

Timothy Shields answered Oct 03 '22 16:10