Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Entity Framework: Efficiently grouping by month

I've done a bit of research on this, and the best I've found so far is to use an Asenumerable on the whole dataset, so that the filtering occurs in linq to objects rather than on the DB. I'm using the latest EF.

My working (but very slow) code is:

        var trendData = 
            from d in ExpenseItemsViewableDirect.AsEnumerable()
            group d by new {Period = d.Er_Approved_Date.Year.ToString() + "-" + d.Er_Approved_Date.Month.ToString("00") } into g
            select new
            {
                Period = g.Key.Period,
                Total = g.Sum(x => x.Item_Amount),
                AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
            };

This gives me months in format YYYY-MM, along with the total amount and average amount. However it takes several minutes every time.

My other workaround is to do an update query in SQL so I have a YYYYMM field to group natively by. Changing the DB isn't an easy fix however so any suggestions would be appreciated.

The thread I found the above code idea (http://stackoverflow.com/questions/1059737/group-by-weeks-in-linq-to-entities) mentions 'waiting until .NET 4.0'. Is there anything recently introduced that helps in this situation?

like image 785
Glinkot Avatar asked Mar 27 '12 08:03

Glinkot


3 Answers

The reason for poor performance is that the whole table is fetched into memory (AsEnumerable()). You can group then by Year and Month like this

var trendData = 
            (from d in ExpenseItemsViewableDirect
            group d by new {
                            Year = d.Er_Approved_Date.Year, 
                            Month = d.Er_Approved_Date.Month 
                            } into g
            select new
            {
                Year = g.Key.Year,
                Month = g.Key.Month,
                Total = g.Sum(x => x.Item_Amount),
                AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
            }
       ).AsEnumerable()
        .Select(g=>new {
              Period = g.Year + "-" + g.Month,
              Total = g.Total,
               AveragePerTrans = g.AveragePerTrans
         });

edit

The original query, from my response, was trying to do a concatenation between an int and a string, which is not translatable by EF into SQL statements. I could use SqlFunctions class, but the query it gets kind ugly. So I added AsEnumerable() after the grouping is made, which means that EF will execute the group query on server, will get the year, month, etc, but the custom projection is made over objects (what follows after AsEnumerable()).

like image 78
Adrian Iftode Avatar answered Nov 20 '22 02:11

Adrian Iftode


When it comes to group by month i prefer to do this task in this way:

var sqlMinDate = (DateTime) SqlDateTime.MinValue;

var trendData = ExpenseItemsViewableDirect
    .GroupBy(x => SqlFunctions.DateAdd("month", SqlFunctions.DateDiff("month", sqlMinDate, x.Er_Approved_Date), sqlMinDate))
    .Select(x => new
    {
        Period = g.Key // DateTime type
    })

As it keeps datetime type in the grouping result.

like image 27
cryss Avatar answered Nov 20 '22 02:11

cryss


Similarly to what cryss wrote, I am doing the following for EF. Note we have to use EntityFunctions to be able to call all DB providers supported by EF. SqlFunctions only works for SQLServer.

var sqlMinDate = (DateTime) SqlDateTime.MinValue; 

(from x in ExpenseItemsViewableDirect
let month = EntityFunctions.AddMonths(sqlMinDate, EntityFunctions.DiffMonths(sqlMinDate, x.Er_Approved_Date))
group d by month 
into g
select new
{
Period = g.Key,
   Total = g.Sum(x => x.Item_Amount),
   AveragePerTrans = Math.Round(g.Average(x => x.Item_Amount),2)
}).Dump();

A taste of generated SQL (from a similar schema):

-- Region Parameters
DECLARE @p__linq__0 DateTime2 = '1753-01-01 00:00:00.0000000'
DECLARE @p__linq__1 DateTime2 = '1753-01-01 00:00:00.0000000'
-- EndRegion
SELECT 
1 AS [C1], 
[GroupBy1].[K1] AS [C2], 
[GroupBy1].[A1] AS [C3]
FROM ( SELECT 
    [Project1].[C1] AS [K1], 
    FROM ( SELECT 
        DATEADD (month, DATEDIFF (month, @p__linq__1, [Extent1].[CreationDate]), @p__linq__0) AS [C1]
        FROM [YourTable] AS [Extent1]
    )  AS [Project1]
    GROUP BY [Project1].[C1]
)  AS [GroupBy1]
like image 2
Boris Avatar answered Nov 20 '22 03:11

Boris