LINQ - Writing an extension method to get the row with maximum value for each group

My application frequently needs to group a table, then return the row with the maximum value for that group. This is pretty easy to do in LINQ:

myTable.GroupBy(r => r.FieldToGroupBy)
    .Select(r => r.Max(s => s.FieldToMaximize))
    .Join(
        myTable,
        r => r,
        r => r.FieldToMaximize,
        (o, i) => i)

Now suppose I want to abstract this out into its own method. I tried writing this:

public static IQueryable<TSource>
SelectMax<TSource, TGroupKey, TMaxKey>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, TGroupKey>> groupKeySelector,
    Expression<Func<TSource, TMaxKey>> maxKeySelector)
    where TMaxKey : IComparable
{
    return source
        .GroupBy(groupKeySelector)
        .Join(
            source,
            g => g.Max(maxKeySelector),
            r => maxKeySelector(r),
                (o, i) => i);
}

Unfortunately this doesn't compile: maxKeySelector is an expression, so you can't call it on r, and you can't even pass it to Max. So I tried rewriting, making maxKeySelector a function rather than an expression:

public static IQueryable<TSource>
SelectMax<TSource, TGroupKey, TMaxKey>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, TGroupKey>> groupKeySelector,
    Func<TSource, TMaxKey> maxKeySelector)
    where TMaxKey : IComparable
{
    return source
        .GroupBy(groupKeySelector)
        .Join(
            source,
            g => g.Max(maxKeySelector),
            r => maxKeySelector(r),
                (o, i) => i);
}

Now this compiles. But it fails at runtime: "Unsupported overload used for query operator 'Max'." This is what I'm stuck on: I need to find the right way to pass maxKeySelector into Max().

Any suggestions? I'm using LINQ to SQL, which seems to make a difference.

ctkrohn asked Aug 18 '11 22:08


2 Answers

First of all, I'd like to point out that what you're trying to do is even easier than you think in LINQ:

myTable.GroupBy(r => r.FieldToGroupBy)
    .Select(g => g.OrderByDescending(r => r.FieldToMaximize).FirstOrDefault())

... which should make our lives a little easier for the second part:

public static IQueryable<TSource>
SelectMax<TSource, TGroupKey, TMaxKey>(
    this IQueryable<TSource> source,
    Expression<Func<TSource, TGroupKey>> groupKeySelector,
    Expression<Func<TSource, TMaxKey>> maxKeySelector)
    where TMaxKey : IComparable
{
    return source
        .GroupBy(groupKeySelector)
        .Select(g => g.AsQueryable().OrderByDescending(maxKeySelector).FirstOrDefault());
}

The key is that by making your group an IQueryable, you open up a new set of LINQ methods that can take actual expressions rather than taking Funcs. This should be compatible with most standard LINQ providers.
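To make the Expression-versus-Func distinction concrete, here is a minimal sketch. The class and the `Evaluate` helper are hypothetical names made up for illustration, and it runs entirely in memory, so it shows nothing about how a particular provider translates the query.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVersusFunc
{
    // Hypothetical helper: an expression tree has to be compiled
    // into a delegate before it can be invoked.
    public static TResult Evaluate<T, TResult>(
        Expression<Func<T, TResult>> selector, T input)
    {
        // Writing selector(input) here would not compile:
        // an expression tree is data describing code, not a delegate.
        Func<T, TResult> compiled = selector.Compile();
        return compiled(input);
    }

    public static void Main()
    {
        Expression<Func<string, int>> lengthOf = s => s.Length;

        // The tree can be inspected, which is what allows a query
        // provider to translate it instead of executing it.
        Console.WriteLine(lengthOf.Body.NodeType);     // MemberAccess
        Console.WriteLine(Evaluate(lengthOf, "abc"));  // 3
    }
}
```

The Queryable operators accept the tree form as-is, while the Enumerable operators available on a plain group accept only the compiled delegate form.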

StriplingWarrior answered Sep 28 '22 04:09


Very interesting. Sometimes "dynamic" can cost you more in sheer development and runtime execution than it is worth (IMHO). Nonetheless, here's the easiest:

public static IQueryable<Item> _GroupMaxs(this IQueryable<Item> list)
{
    return list.GroupBy(x => x.Family)
        .Select(g => g.OrderByDescending(x => x.Value).First());
}

And, here's the most dynamic approach:

public static IQueryable<T> _GroupMaxs<T, TGroupCol, TValueCol>
    (this IQueryable<T> list, string groupColName, string valueColName)
{
    // (x => x.groupColName)
    var _GroupByPropInfo = typeof(T).GetProperty(groupColName);
    var _GroupByParameter = Expression.Parameter(typeof(T), "x");
    var _GroupByProperty = Expression
            .Property(_GroupByParameter, _GroupByPropInfo);
    var _GroupByLambda = Expression.Lambda<Func<T, TGroupCol>>
        (_GroupByProperty, new ParameterExpression[] { _GroupByParameter });

    // (x => x.valueColName)
    var _SelectParameter = Expression.Parameter(typeof(T), "x");
    var _SelectProperty = Expression
            .Property(_SelectParameter, valueColName);
    var _SelectLambda = Expression.Lambda<Func<T, TValueCol>>
        (_SelectProperty, new ParameterExpression[] { _SelectParameter });

    // return list.GroupBy(x => x.groupColName)
    //   .Select(g => g.OrderByDescending(x => x.valueColName).First());
    return list.GroupBy(_GroupByLambda)
        .Select(g => g.OrderByDescending(_SelectLambda.Compile()).First());
}

As you can see, I precede my Extension Methods with an underscore. You do not need to do this, of course. Just take the general idea and run with it.

You would call it like this:

public class Item
{
    public string Family { get; set; }
    public int Value { get; set; }
}

foreach (Item item in _List
        .AsQueryable()._GroupMaxs<Item, String, int>("Family", "Value"))
    Console.WriteLine("{0}:{1}", item.Family, item.Value);
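If you want to try the simpler, non-dynamic version on its own, a self-contained sketch against an in-memory list might look like the following. The sample data and the `GroupMaxDemo` class are made up for illustration, and since it runs on LINQ to Objects it does not show how LINQ to SQL would translate the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public string Family { get; set; }
    public int Value { get; set; }
}

public static class GroupMaxDemo
{
    // Same shape as the simple version above: one row per Family,
    // chosen as the row with the highest Value in that group.
    public static IQueryable<Item> GroupMaxs(this IQueryable<Item> list)
    {
        return list.GroupBy(x => x.Family)
            .Select(g => g.OrderByDescending(x => x.Value).First());
    }

    public static void Main()
    {
        // Illustrative sample data (an assumption, not from the question).
        var items = new List<Item>
        {
            new Item { Family = "A", Value = 1 },
            new Item { Family = "A", Value = 3 },
            new Item { Family = "B", Value = 2 },
        };

        foreach (Item item in items.AsQueryable().GroupMaxs())
            Console.WriteLine("{0}:{1}", item.Family, item.Value);
        // prints A:3, then B:2
    }
}
```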

Best of luck!

Jerry Nixon answered Sep 28 '22 02:09