Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count of distinct with linq to entities and custom IEqualityComparer

My custom comparer doesn't seem to be working. I want a count of distinct objects, but I am getting a count of 1 every time. Even though looking at the database itself clearly shows there are more 1 instances of the query with distinct 'TimeOfAction' values.

 class TimeComparer : IEqualityComparer<Action>
{
    public bool Equals(Action a, Action b)
    {
        if (a.TimeOfAction == b.TimeOfAction)
            return true;
        else
            return false;
    }

    public int GetHashCode(Action obj)
    {
        return obj.ToString().ToLower().GetHashCode();
    }
}

Think it could be the GetHashCode method, as I'm not too familiar with how its meant to work. Here is the linq query. I converted to AsEnumerable since Linq to Entities does not support the distinct method.

DBEntities db = new DBEntities();

        IEnumerable<Action> query = 
                    from action in db.Action.AsEnumerable()
                    where action.TimeOfAction > new DateTime(2010, 11, 1, 0, 0, 0)
                    where action.TimeOfAction < new DateTime(2011, 2, 7, 0, 0, 0)
                    where action.EntityName == "seant"
                    select action;

var count = query.   
            Distinct(new TimeComparer()).Count();
like image 977
Sean Avatar asked Dec 29 '22 01:12

Sean


1 Answers

Your Equals and GetHashCode methods are taking completely different approaches. In particular, equal objects may well have different hash codes, assuming Action.ToString uses fields other than TimeOfAction. They have to be aligned, or you're going to have absolutely no chance of getting reasonable results. It's okay for unequal objects to have the same hashcode (although that will hinder performance) but equal objects must give the same hash code.

Note that using a custom comparer will force the Distinct part to be done in-process rather than in the database. That may well not be a problem, you just need to understand it. EDIT: I hadn't spotted that there's an overload of Queryable.Distinct which does take an IEqualityComparer<T>. My guess is that that's so you can provide custom string comparers and a few other well-known comparers... not just arbitrary code. If it works, it's going to be done locally anyway. I wouldn't be surprised if it just blew up though.

EDIT: As Marc says, you can use Select(x => x.TimeOfAction).Distinct().Count() to do this in the database. You'll need to remove the call to AsEnumerable too though. My guess is that that's there because something else wasn't working. You could try this:

DBEntities db = new DBEntities();
IQueryable<DateTime> query = 
            from action in db.Action
            where action.TimeOfAction > new DateTime(2010, 11, 1, 0, 0, 0)
            where action.TimeOfAction < new DateTime(2011, 2, 7, 0, 0, 0)
            where action.EntityName == "seant"
            select action.TimeOfAction;
var count = query.Distinct().Count();

Of course if you needed query for something else you'd need to keep the original version too:

DBEntities db = new DBEntities();
IQueryable<Action> query = 
            from action in db.Action
            where action.TimeOfAction > new DateTime(2010, 11, 1, 0, 0, 0)
            where action.TimeOfAction < new DateTime(2011, 2, 7, 0, 0, 0)
            where action.EntityName == "seant"
            select action;

var count = query.Select(x => x.TimeOfAction).Distinct().Count();
// Use query here as well to get at full action details

Note that using query again will produce a second database query. You'll need to look at what's going on in terms of transactions if you need the count to be consistent with whatever the second query does... or pull all the details from the database (using a ToList call) and then do the Distinct part in-process.


Back to custom equality comparers...

Assuming TimeOfAction is a DateTime or some other type which has a reasonable hash code, you can change your class to:

class TimeComparer : IEqualityComparer<Action>
{
    public bool Equals(Action a, Action b)
    {
        return a.TimeOfAction == b.TimeOfAction;
    }

    public int GetHashCode(Action obj)
    {
        return obj.TimeOfAction.GetHashCode();
    }
}

Note that I've also simplified your Equals method - any time you find yourself with:

if (condition)
{
    return true;
}
else
{
    return false;
}

you can simplify it to:

return condition;
like image 106
Jon Skeet Avatar answered Feb 01 '23 17:02

Jon Skeet