Faking IGrouping for LINQ

Imagine you have a large dataset that may or may not be filtered by a particular condition on its elements, and that condition is expensive to calculate. When the dataset is not filtered, the elements are grouped by the value of that condition, so the condition is calculated once.

However, when filtering has taken place, the subsequent code still expects an IEnumerable<IGrouping<TKey, TElement>> collection, yet it doesn't make sense to perform a GroupBy operation that would evaluate the condition a second time for each element. Instead, I would like to create an IEnumerable<IGrouping<TKey, TElement>> by wrapping the filtered results appropriately, thus avoiding another evaluation of the condition.

Other than implementing my own class that provides the IGrouping interface, is there any other way I can implement this optimization? Are there existing LINQ methods to support this that would give me the IEnumerable<IGrouping<TKey, TElement>> result? Is there another way that I haven't considered?

Jeff Yates asked Jan 22 '26 21:01


2 Answers

"the condition is calculated once"

I hope those keys are still around somewhere...

If your data was in some structure like this:

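using System.Collections.Generic;

public class CustomGroup<T, U>
{
  // The precomputed condition value shared by every member of the group.
  public T Key { get; set; }
  public IEnumerable<U> GroupMembers { get; set; }
}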

You could project such items with a query like this:

// Pair each member with its stored key, then regroup by that key.
var result = customGroups
  .SelectMany(cg => cg.GroupMembers, (cg, z) => new {Key = cg.Key, Value = z})
  .GroupBy(x => x.Key, x => x.Value);
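
For illustration, here is a self-contained sketch of the same projection. The constructor, the CustomGroupDemo class and the ToGroupings helper are hypothetical names added for this example; the point is only that the stored keys are reused, so no per-element condition needs to be recomputed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container with the same shape as CustomGroup above.
public class CustomGroup<TKey, TElement>
{
    public CustomGroup(TKey key, IEnumerable<TElement> groupMembers)
    {
        Key = key;
        GroupMembers = groupMembers;
    }

    public TKey Key { get; }
    public IEnumerable<TElement> GroupMembers { get; }
}

public static class CustomGroupDemo
{
    // Flattens the pre-keyed groups and regroups them by the stored key.
    // GroupBy only reads the Key property; it never re-derives it from the elements.
    public static IEnumerable<IGrouping<TKey, TElement>> ToGroupings<TKey, TElement>(
        IEnumerable<CustomGroup<TKey, TElement>> customGroups)
    {
        return customGroups
            .SelectMany(cg => cg.GroupMembers, (cg, member) => new { cg.Key, Value = member })
            .GroupBy(x => x.Key, x => x.Value);
    }
}
```

A call such as CustomGroupDemo.ToGroupings(myGroups) would then hand downstream code the IEnumerable<IGrouping<TKey, TElement>> it expects.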
Amy B answered Jan 25 '26 09:01


Inspired by David B's answer, I have come up with a simple solution. So simple that I have no idea how I missed it.

In order to perform the filtering, I obviously need to know what value of the condition I am filtering by. Therefore, given that value, c, I can just project the filtered list as:

filteredList.GroupBy(x => c)

This avoids any recalculation of properties on the elements (represented by x).
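
A minimal sketch of this idea, assuming a hypothetical ExpensiveCondition method and an evaluation counter that exist only for this example. The counter shows that the condition runs once per element during filtering and not at all during the constant-key GroupBy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConstantKeyDemo
{
    // Counts calls to the (hypothetical) expensive condition.
    public static int Evaluations;

    // Stand-in for the costly per-element calculation.
    public static int ExpensiveCondition(int x)
    {
        Evaluations++;
        return x % 3;
    }

    // Filters by the known value c, then wraps the survivors in a single
    // grouping keyed by c. The key selector ignores x, so the condition
    // is not evaluated again.
    public static List<IGrouping<int, int>> FilterAndWrap(IEnumerable<int> source, int c)
    {
        var filteredList = source.Where(x => ExpensiveCondition(x) == c).ToList();
        return filteredList.GroupBy(x => c).ToList();
    }
}
```

One behavior to keep in mind: if the filtered list is empty, GroupBy produces no groups at all instead of one empty group keyed by c.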

Another solution I realized would work is to reverse the ordering of my query and perform the grouping before I perform the filtering. This too would mean the condition is only evaluated once per element, although it would unnecessarily allocate groupings that I wouldn't subsequently use.
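
The reversed ordering might look like the sketch below. ExpensiveCondition, the counter and the nullable filter parameter are assumptions made for this example and are not part of my actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupThenFilterDemo
{
    // Counts calls to the (hypothetical) expensive condition.
    public static int Evaluations;

    // Stand-in for the costly per-element calculation.
    public static int ExpensiveCondition(int x)
    {
        Evaluations++;
        return x % 3;
    }

    // Groups first, so the condition runs once per element, then keeps only
    // the group whose key matches c. Passing null means "no filter".
    public static List<IGrouping<int, int>> GroupThenFilter(IEnumerable<int> source, int? c)
    {
        IEnumerable<IGrouping<int, int>> groups = source.GroupBy(x => ExpensiveCondition(x));
        if (c.HasValue)
        {
            // Filtering compares stored keys only; the unused groups were
            // still built, which is the extra allocation mentioned above.
            groups = groups.Where(g => g.Key == c.Value);
        }
        return groups.ToList();
    }
}
```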

Jeff Yates answered Jan 25 '26 09:01
