Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Workaround needed for EF Core performing GroupBy operations in memory instead of in SQL

I'm working in Entity Framework Core 1.1.0 (and upgrading is not an option at this point, due to breaking changes in later versions). My query is of the following form:

var q = db.MyTable
            .GroupBy(t => new { t.Field1 })
            .Select(g => new
            {
                g.Key.Field1,
                MaxField2 = g.Max(x => x.Field2)
            })
            .ToList();

In test code this works well and returns the expected data. But when deployed to a real environment, with real data, it times out. Why? Well, I put a sniffer on the SQL server, and here's the actual SQL:

SELECT [t].[Field1], [t].[Field2], [t].[Field3], [t].[Field4], [t].[Field5]
FROM [dbo].[MyTable] AS [t]
ORDER BY [t].[Field1]

Oh. Well that would explain it. EF is only compiling the query up to the .GroupBy() into SQL, thus attempting to load the entire contents of the table (some 17 million records at this time of writing) into memory, and the rest of the grouping and ordering is supposed to be done in memory.

Any suggestions how to rework this query so that the heavy lifting is done in SQL?

like image 632
Shaul Behr Avatar asked Dec 23 '22 16:12

Shaul Behr


1 Answers

As @xanatos points out, this is is not supported in EF Core 1.1.0 (and not even 2.0.0). There is, however, a workaround, using literal SQL:

var q = db.MyTable
        .FromSql("select t.* from " +
                 "  (select distinct Field1 from MyTable) t0 " +
                 "cross apply " +
                 "  (select top 1 t.* from MyTable t " +
                 "  where t.Field1 = t0.Field1 " +
                 "  order by t.Field2 desc) t")                     
        .Select(t => new
        {
            t.Field1,
            MaxField2 = t.Field2
        })
        .ToList();

Not the solution I'd hoped for, but it works a charm.

like image 59
Shaul Behr Avatar answered Dec 29 '22 07:12

Shaul Behr