Entity Framework GroupBy take the oldest with mySQL

I have a huge list of items and need to group them by one property, then select the oldest item of each group.

Simplified Example: Select the oldest User of each FirstName.

using (ED.NWEntities ctx = new ED.NWEntities())
{
    IQueryable<ED.User> Result = ctx.User.GroupBy(x => x.FirstName)
                                    .Select(y => y.OrderBy(z => z.BirthDate)
                                    .FirstOrDefault())
                                    .AsQueryable();
}

Class User:

public partial class User
{
    public int UserID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Nullable<System.DateTime> BirthDate { get; set; }
}

I was wondering why this statement took so long until I set a breakpoint at Result and looked into the SQL statement generated:

{SELECT
`Apply1`.`UserID`, 
`Apply1`.`FIRSTNAME1` AS `FirstName`, 
`Apply1`.`LastName`, 
`Apply1`.`BirthDate`
FROM (SELECT
`Distinct1`.`FirstName`, 
(SELECT
`Project2`.`UserID`
FROM `User` AS `Project2`
 WHERE (`Distinct1`.`FirstName` = `Project2`.`FirstName`) OR ((`Distinct1`.`FirstName` IS  NULL) AND (`Project2`.`FirstName` IS  NULL))
 ORDER BY 
`Project2`.`BirthDate` ASC LIMIT 1) AS `UserID`, 
(SELECT
`Project2`.`FirstName`
FROM `User` AS `Project2`
 WHERE (`Distinct1`.`FirstName` = `Project2`.`FirstName`) OR ((`Distinct1`.`FirstName` IS  NULL) AND (`Project2`.`FirstName` IS  NULL))
 ORDER BY 
`Project2`.`BirthDate` ASC LIMIT 1) AS `FIRSTNAME1`, 
(SELECT
`Project2`.`LastName`
FROM `User` AS `Project2`
 WHERE (`Distinct1`.`FirstName` = `Project2`.`FirstName`) OR ((`Distinct1`.`FirstName` IS  NULL) AND (`Project2`.`FirstName` IS  NULL))
 ORDER BY 
`Project2`.`BirthDate` ASC LIMIT 1) AS `LastName`, 
(SELECT
`Project2`.`BirthDate`
FROM `User` AS `Project2`
 WHERE (`Distinct1`.`FirstName` = `Project2`.`FirstName`) OR ((`Distinct1`.`FirstName` IS  NULL) AND (`Project2`.`FirstName` IS  NULL))
 ORDER BY 
`Project2`.`BirthDate` ASC LIMIT 1) AS `BirthDate`
FROM (SELECT DISTINCT 
`Extent1`.`FirstName`
FROM `User` AS `Extent1`) AS `Distinct1`) AS `Apply1`}

Question: Is there a more efficient way to solve this? Sub-selects are expensive and EF generates one per column. I use MySQL .NET Connector version 6.9.5.0.

fubo asked Apr 22 '16 06:04



1 Answer

You could use the DistinctBy extension method from Jon Skeet's answer on Distinct:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

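You can then try the following. Note that DistinctBy operates on IEnumerable<T>, so only the ordering runs in the database; the de-duplication by FirstName happens in memory after the rows have been read: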

using (ED.NWEntities ctx = new ED.NWEntities())
{
    IQueryable<ED.User> Result = ctx.User.OrderBy(y => y.BirthDate)
                                    .DistinctBy(z => z.FirstName)
                                    .AsQueryable();
}
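
To see what the OrderBy + DistinctBy combination returns, here is a minimal in-memory sketch that does not touch EF or MySQL at all. The names FirstPerKey, OldestPerFirstName and SampleUsers, as well as the sample data, are made up for illustration. FirstPerKey is the same logic as the DistinctBy helper above, renamed only so that it cannot clash with the built-in Enumerable.DistinctBy that exists from .NET 6 on. The Where filter is my own assumption: BirthDate is nullable and null sorts first in ascending order, so without it a user with no BirthDate would be picked as the "oldest".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public int UserID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime? BirthDate { get; set; }
}

public static class Demo
{
    // Same logic as the DistinctBy helper above, under a hypothetical name:
    // yields the first element seen for each key, in source order.
    public static IEnumerable<TSource> FirstPerKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    // Hypothetical helper: oldest user per FirstName, computed in memory.
    public static List<User> OldestPerFirstName(IEnumerable<User> users)
    {
        return users
            .Where(u => u.BirthDate != null)   // assumption: skip unknown birth dates
            .OrderBy(u => u.BirthDate)         // earliest BirthDate (oldest) first
            .FirstPerKey(u => u.FirstName)     // keep the first row per FirstName
            .ToList();
    }

    // Made-up sample data, for illustration only.
    public static List<User> SampleUsers()
    {
        return new List<User>
        {
            new User { UserID = 1, FirstName = "Anna", LastName = "Smith", BirthDate = new DateTime(1980, 5, 1) },
            new User { UserID = 2, FirstName = "Anna", LastName = "Jones", BirthDate = new DateTime(1975, 3, 12) },
            new User { UserID = 3, FirstName = "Bob",  LastName = "Brown", BirthDate = new DateTime(1990, 1, 1) },
            new User { UserID = 4, FirstName = "Bob",  LastName = "Gray",  BirthDate = null },
        };
    }

    public static void Main()
    {
        foreach (var u in OldestPerFirstName(SampleUsers()))
        {
            Console.WriteLine(u.FirstName + " " + u.LastName);
        }
        // Prints:
        // Anna Jones
        // Bob Brown
    }
}
```

With a real context, the same chain would still read every row of the User table from MySQL (ordered by BirthDate) and only then discard the duplicates on the client, so whether it beats the generated sub-selects depends on the table size and is worth measuring.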

jegtugado answered Sep 22 '22 21:09
