Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LinQ distinct with custom comparer leaves duplicates

I've got the following classes:

public class SupplierCategory : IEquatable<SupplierCategory>
{
    public string Name { get; set; }
    public string Parent { get; set; }

    #region IEquatable<SupplierCategory> Members

    public bool Equals(SupplierCategory other)
    {
        return this.Name == other.Name && this.Parent == other.Parent;
    }

    #endregion
}

public class CategoryPathComparer : IEqualityComparer<List<SupplierCategory>>
{
    #region IEqualityComparer<List<SupplierCategory>> Members

    public bool Equals(List<SupplierCategory> x, List<SupplierCategory> y)
    {
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<SupplierCategory> obj)
    {
        return obj.GetHashCode();
    }

    #endregion
}

And i'm using the following linq query:

CategoryPathComparer comparer = new CategoryPathComparer();
List<List<SupplierCategory>> categoryPaths = (from i in infoList
                                                          select
                                                            new List<SupplierCategory>() { 
                                                             new SupplierCategory() { Name = i[3] },
                                                             new SupplierCategory() { Name = i[4], Parent = i[3] },
                                                             new SupplierCategory() { Name = i[5], Parent = i[4] }}).Distinct(comparer).ToList();

But the distinct does not do what I want it to do, as the following code demonstrates:

comp.Equals(categoryPaths[0], categoryPaths[1]); //returns True

Am I using this in a wrong way? why are they not compared as I intend them to?

Edit: To demonstrate the the comparer does work, the following returns true as it should:

List<SupplierCategory> list1 = new List<SupplierCategory>() {
    new SupplierCategory() { Name = "Cat1" },
    new SupplierCategory() { Name = "Cat2", Parent = "Cat1" },
    new SupplierCategory() { Name = "Cat3", Parent = "Cat2" }
};
List<SupplierCategory> list1 = new List<SupplierCategory>() {
    new SupplierCategory() { Name = "Cat1" },
    new SupplierCategory() { Name = "Cat2", Parent = "Cat1" },
    new SupplierCategory() { Name = "Cat3", Parent = "Cat2" }
};
CategoryPathComparer comp = new CategoryPathComparer();
Console.WriteLine(comp.Equals(list1, list2).ToString());
like image 786
NDM Avatar asked Oct 25 '09 13:10

NDM


2 Answers

Your problem is that you didn't implement IEqualityComparer correctly.

When you implement IEqualityComparer<T>, you must implement GetHashCode so that any two equal objects have the same hashcode.

Otherwise, you will get incorrect behavior, as you're seeing here.

You should implement GetHashCode as follows: (courtesy of this answer)

public int GetHashCode(List<SupplierCategory> obj) {
    int hash = 17;

    foreach(var value in obj)
        hash = hash * 23 + obj.GetHashCode();

    return hash;
}

You also need to override GetHashCode in SupplierCategory to be consistent. For example:

public override int GetHashCode() {
    int hash = 17;
    hash = hash * 23 + Name.GetHashCode();
    hash = hash * 23 + Parent.GetHashCode();
    return hash;
}

Finally, although you don't need to, you should probably override Equals in SupplierCategory and make it call the Equals method you implemented for IEquatable.

like image 80
SLaks Avatar answered Oct 30 '22 12:10

SLaks


Actually, this issue is even covered in documentation: http://msdn.microsoft.com/en-us/library/bb338049.aspx.

like image 37
Alexandra Rusina Avatar answered Oct 30 '22 12:10

Alexandra Rusina