Large LINQ Grouping query, what's happening behind the scenes

Take the following LINQ query as an example. Please don't comment on the code itself as I've just typed it to help with this question.

The following LINQ query uses a 'group by' and calculates summary information. As you can see, numerous calculations are performed on the data, but how efficient is LINQ behind the scenes?

var NinjasGrouped = (from ninja in Ninjas
    group ninja by new { ninja.NinjaClan, ninja.NinjaRank }
    into con
    select new NinjaGroupSummary
    {
        NinjaClan = con.Key.NinjaClan,
        NinjaRank = con.Key.NinjaRank,
        NumberOfShoes = con.Sum(x => x.Shoes),
        MaxNinjaAge = con.Max(x => x.NinjaAge),
        MinNinjaAge = con.Min(x => x.NinjaAge),
        ComplicatedCalculation = con.Sum(x => x.NinjaGrade) != 0
            ? con.Sum(x => x.NinjaRedBloodCellCount) / con.Sum(x => x.NinjaDoctorVisits)
            : 0,
        ListOfNinjas = con.ToList()
    }).ToList();

  1. How many times is the list of 'Ninjas' being iterated over in order to calculate each of the values?
  2. Would it be faster to employ a foreach loop to speed up the execution of such a query?
  3. Would adding '.AsParallel()' after Ninjas result in any performance improvements?
  4. Is there a better way of calculating summary information for a List?

Any advice is appreciated, as we use this type of code throughout our software and I would really like to gain a better understanding of what LINQ is doing under the hood (so to speak). Perhaps there is a better way?

Belinda asked Sep 12 '11 22:09



1 Answer

Assuming this is a LINQ to Objects query:

  • Ninjas is only iterated over once; the groups are built up into internal concrete lists, which you're then iterating over multiple times (once per aggregation).
  • Using a foreach loop almost certainly wouldn't speed things up - you might benefit from cache locality a bit more (as each time you iterate over a group it'll probably have to fetch data from a slower level of cache or from main memory) but I very much doubt that it would be significant. The increase in pain in implementing it probably would be significant though :)
  • Using AsParallel might speed things up - it looks pretty easily parallelizable. Worth a try...
  • There's not a much better way for LINQ to Objects, to be honest. It would be nice to be able to perform the aggregation as you're grouping, and Reactive Extensions would allow you to do something like that, but for the moment this is probably the simplest approach.
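
For comparison, here is a rough sketch of what the hand-written foreach alternative from the second point might look like: one pass over the source, updating each group's running totals as it goes. The Ninja and NinjaGroupSummary shapes below are assumptions (the question doesn't show them, so the property types are guesses), SinglePassSummary is a hypothetical helper name, and ComplicatedCalculation is left out for brevity. It needs C# 7 or later for the tuple key.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the question's element type; property types are guesses.
public class Ninja
{
    public string NinjaClan { get; set; }
    public int NinjaRank { get; set; }
    public int Shoes { get; set; }
    public int NinjaAge { get; set; }
}

// Assumed shape of the question's summary type (ComplicatedCalculation omitted).
public class NinjaGroupSummary
{
    public string NinjaClan { get; set; }
    public int NinjaRank { get; set; }
    public int NumberOfShoes { get; set; }
    public int MaxNinjaAge { get; set; }
    public int MinNinjaAge { get; set; }
    public List<Ninja> ListOfNinjas { get; } = new List<Ninja>();
}

// Hypothetical helper, not part of LINQ.
public static class SinglePassSummary
{
    public static List<NinjaGroupSummary> Summarise(IEnumerable<Ninja> ninjas)
    {
        var byKey = new Dictionary<(string, int), NinjaGroupSummary>();
        var results = new List<NinjaGroupSummary>(); // keeps first-seen key order

        foreach (var ninja in ninjas)
        {
            var key = (ninja.NinjaClan, ninja.NinjaRank);
            if (!byKey.TryGetValue(key, out var summary))
            {
                // First ninja for this key: seed min/max so any age replaces them.
                summary = new NinjaGroupSummary
                {
                    NinjaClan = ninja.NinjaClan,
                    NinjaRank = ninja.NinjaRank,
                    MaxNinjaAge = int.MinValue,
                    MinNinjaAge = int.MaxValue
                };
                byKey.Add(key, summary);
                results.Add(summary);
            }

            // Update every aggregate while the element is already in hand.
            summary.NumberOfShoes += ninja.Shoes;
            summary.MaxNinjaAge = Math.Max(summary.MaxNinjaAge, ninja.NinjaAge);
            summary.MinNinjaAge = Math.Min(summary.MinNinjaAge, ninja.NinjaAge);
            summary.ListOfNinjas.Add(ninja);
        }

        return results;
    }
}
```

Calling SinglePassSummary.Summarise(Ninjas) gives one summary per (clan, rank) pair. It avoids re-walking each group once per aggregate, but every new aggregate means more hand-written bookkeeping, which is the "pain" mentioned above. I'd measure before assuming it's actually faster for your data.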

You might want to have a look at the GroupBy post in my Edulinq blog series for more details on a possible implementation.
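
To make the "iterated once, then per group" behaviour concrete, here is a deliberately simplified sketch of how a GroupBy could be implemented. It is not the framework's code and not the Edulinq version either: SimpleGroupBy and CountSourceReads are hypothetical names, and it returns plain key/list pairs rather than IGrouping. Null keys aren't handled, since Dictionary rejects them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupBySketch
{
    // Hypothetical, simplified stand-in for Enumerable.GroupBy.
    // Being an iterator block, it is deferred: nothing runs until enumeration starts.
    public static IEnumerable<KeyValuePair<TKey, List<TSource>>> SimpleGroupBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var groups = new Dictionary<TKey, List<TSource>>();
        var keysInOrder = new List<TKey>(); // yield groups in first-seen key order

        // The one and only pass over the source: every element is buffered
        // into the concrete list belonging to its key.
        foreach (var item in source)
        {
            var key = keySelector(item);
            if (!groups.TryGetValue(key, out var members))
            {
                members = new List<TSource>();
                groups.Add(key, members);
                keysInOrder.Add(key);
            }
            members.Add(item);
        }

        foreach (var key in keysInOrder)
        {
            yield return new KeyValuePair<TKey, List<TSource>>(key, groups[key]);
        }
    }

    // Counts how many times elements are pulled from the source while
    // three aggregates are computed per group.
    public static int CountSourceReads()
    {
        int reads = 0;
        var source = new[] { 1, 2, 3, 4, 5, 6 }.Select(x => { reads++; return x; });

        var summaries = source
            .SimpleGroupBy(x => x % 2)
            .Select(g => new
            {
                g.Key,
                Sum = g.Value.Sum(), // each aggregate walks the buffered list,
                Max = g.Value.Max(), // not the original source
                Min = g.Value.Min()
            })
            .ToList();

        return reads;
    }
}
```

CountSourceReads() returns 6: each source element is read exactly once, even though every group's buffered list is walked three times for Sum, Max and Min.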

Jon Skeet answered Oct 09 '22 19:10