
Distinct() returns duplicates with a user-defined type

Tags:

c#

linq

I'm trying to write a LINQ query that returns an array of objects, each constructed from a unique value. For integer types, Distinct() returns only one copy of each value, but when I create my list of objects instead, the duplicates remain. I suspect it's a problem with the equality operator for my class, but when I set a breakpoint in it, it's never hit.

Filtering out the duplicate int in a sub-expression solves the problem, and also saves me from constructing objects that will be immediately discarded, but I'm curious why this version doesn't work.

UPDATE: 11:04 PM Several folks have pointed out that MyType doesn't override GetHashCode(). I'm afraid I oversimplified the example. The original MyType does indeed implement it. I've added it below, modified only to put the hash code in a temp variable before returning it.

Running through the debugger, I see that all five invocations of GetHashCode return a different value. And since MyType only inherits from Object, this is presumably the same behavior Object would exhibit.

Would I be correct then to conclude that the hash should instead be based on the contents of Value? This was my first attempt at overriding operators, and at the time, it didn't appear that GetHashCode needed to be particularly fancy. (This is the first time one of my equality checks didn't seem to work properly.)

class Program
{
    static void Main(string[] args)
    {
        int[] list = { 1, 3, 4, 4, 5 };
        int[] list2 =
            (from value in list
             select value).Distinct().ToArray();    // One copy of each value.
        MyType[] distinct =
            (from value in list
             select new MyType(value)).Distinct().ToArray(); // Two objects created with 4.

        Array.ForEach(distinct, value => Console.WriteLine(value));
    }
}

class MyType
{
    public int Value { get; private set; }

    public MyType(int arg)
    {
        Value = arg;
    }

    public override int GetHashCode()
    {
        int retval = base.GetHashCode();
        return retval;
    }

    public override bool Equals(object obj)
    {
        if (obj == null)
            return false;

        MyType rhs = obj as MyType;
        if ((Object)rhs == null)
            return false;

        return this == rhs;
    }

    public static bool operator ==(MyType lhs, MyType rhs)
    {
        bool result;

        if ((Object)lhs != null && (Object)rhs != null)
            result = lhs.Value == rhs.Value;
        else
            result = (Object)lhs == (Object)rhs;

        return result;
    }

    public static bool operator !=(MyType lhs, MyType rhs)
    {
        return !(lhs == rhs);
    }
}
ThatBlairGuy asked Nov 29 '10 23:11


4 Answers

You need to override GetHashCode() in your class. GetHashCode() must be implemented in tandem with Equals() overrides: two objects that compare equal must return the same hash code. Hash-based code such as Distinct() typically compares hash codes first and calls Equals() only when they match. That's why your Equals() implementation is not getting called.
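To see the "hash first, Equals second" behaviour directly, here is a minimal sketch. The `Probe` and `HashDemo` names are hypothetical and not part of the question's code; `Probe` counts its own Equals() calls and lets you pick between Object's per-instance hash and a hash derived from Value.

```csharp
using System;
using System.Linq;

// Hypothetical probe type (not from the question): Equals always compares
// Value, but the hashing strategy is selectable so both cases can be observed.
class Probe
{
    public static int EqualsCalls;              // how many times Equals has run

    private readonly bool hashByValue;
    public int Value { get; private set; }

    public Probe(int value, bool hashByValue)
    {
        Value = value;
        this.hashByValue = hashByValue;
    }

    public override int GetHashCode()
    {
        // Either hash the contents or fall back to Object's per-instance hash.
        return hashByValue ? Value.GetHashCode() : base.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        EqualsCalls++;
        Probe other = obj as Probe;
        return other != null && other.Value == Value;
    }
}

static class HashDemo
{
    // Number of elements Distinct() keeps for the question's input list.
    public static int CountDistinct(bool hashByValue)
    {
        int[] list = { 1, 3, 4, 4, 5 };
        return list.Select(v => new Probe(v, hashByValue)).Distinct().Count();
    }

    static void Main()
    {
        Probe.EqualsCalls = 0;
        Console.WriteLine("base hash:  kept {0}, Equals calls {1}",
            CountDistinct(false), Probe.EqualsCalls);

        Probe.EqualsCalls = 0;
        Console.WriteLine("value hash: kept {0}, Equals calls {1}",
            CountDistinct(true), Probe.EqualsCalls);
    }
}
```

With the per-instance hash, the two objects built from 4 will almost certainly hash differently, so Equals() is normally never consulted and both survive. With the value-based hash they land on the same hash code, Equals() runs, and only four elements are kept.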

dthorpe answered Nov 15 '22 07:11


Your suspicion is correct: it is the equality check, which currently just compares the object references. Your implementation does not do anything extra, so change it to this:

public override bool Equals(object obj)
{
    if (obj == null)
        return false;

    MyType rhs = obj as MyType;
    if ((Object)rhs == null)
        return false;

    return this.Value == rhs.Value;
}
Aliostad answered Nov 15 '22 07:11


It seems that a simple Distinct operation can be implemented more elegantly as follows:

var distinct = items.GroupBy(x => x.ID).Select(x => x.First());

where ID is the property that determines whether two objects are semantically equivalent. Judging from the confusion here (including my own), the default implementation of Distinct() seems to be a little convoluted.
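As a runnable sketch of the GroupBy approach, assuming a hypothetical `Item` class with an int `ID` and no Equals() or GetHashCode() overrides (both names are placeholders, as is `GroupByDemo`):

```csharp
using System;
using System.Linq;

// Hypothetical type with no Equals/GetHashCode overrides at all.
class Item
{
    public int ID { get; private set; }
    public Item(int id) { ID = id; }
}

static class GroupByDemo
{
    // Keeps the first item seen for each ID. The grouping key is an int,
    // so the built-in int equality and hashing do all the work.
    public static Item[] DistinctById(Item[] items)
    {
        return items.GroupBy(x => x.ID).Select(g => g.First()).ToArray();
    }

    static void Main()
    {
        Item[] items = new[] { 1, 3, 4, 4, 5 }
            .Select(id => new Item(id))
            .ToArray();

        foreach (Item item in DistinctById(items))
            Console.WriteLine(item.ID);
    }
}
```

One trade-off to keep in mind: GroupBy has to read the whole source and build every group before it yields anything, so it does more work than Distinct() when the type already has correct equality members.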

Mike Chamberlain answered Nov 15 '22 09:11


You need to implement GetHashCode().
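For the MyType in the question, which already overrides GetHashCode() but only forwards to base.GetHashCode(), one possible fix is to derive the hash from Value, the same data Equals() compares. This is a trimmed-down sketch rather than the asker's full class: the operator overloads are left out, and `DistinctDemo`, `BuildDistinct` and the `ToString()` override are additions for illustration.

```csharp
using System;
using System.Linq;

class MyType
{
    public int Value { get; private set; }

    public MyType(int arg)
    {
        Value = arg;
    }

    // Hash the same data that Equals compares, so equal objects share a hash.
    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        MyType rhs = obj as MyType;
        return (object)rhs != null && Value == rhs.Value;
    }

    // Added for illustration so Console.WriteLine shows the value.
    public override string ToString()
    {
        return Value.ToString();
    }
}

static class DistinctDemo
{
    public static MyType[] BuildDistinct(int[] list)
    {
        return (from value in list
                select new MyType(value)).Distinct().ToArray();
    }

    static void Main()
    {
        MyType[] distinct = BuildDistinct(new[] { 1, 3, 4, 4, 5 });
        Array.ForEach(distinct, value => Console.WriteLine(value));
    }
}
```

With this change the two objects created from 4 produce the same hash code, Distinct() goes on to call Equals(), and only one of them is kept.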

leiz answered Nov 15 '22 07:11