Linq distinct doesn't call Equals method

Tags: c#, linq

I have the following class:

public class ModInfo : IEquatable<ModInfo>
{
    public int ID { get; set; }
    public string MD5 { get; set; }

    public bool Equals(ModInfo other)
    {
        return other.MD5.Equals(MD5);
    }

    public override int GetHashCode()
    {
        return MD5.GetHashCode();
    }
}

I load some data into a list of that class using a method like this:

public void ReloadEverything() {
    var beforeSort = new List<ModInfo>();
    // Bunch of loading from a local SQLite database,
    // not included since it's really boring to look at.
    var modinfo = beforeSort.OrderBy(m => m.ID).AsEnumerable().Distinct().ToList();
}

The problem is that the Distinct() call doesn't seem to do its job: the list still contains objects that are equal to each other.

According to this article: https://msdn.microsoft.com/en-us/library/vstudio/bb348436%28v=vs.100%29.aspx, that is how you are supposed to make Distinct work. However, it doesn't seem to be calling the Equals method on the ModInfo object. What could be causing this?

Example values:

modinfo[0]: ID = 2069, MD5 = 0AAEBF5D2937BDF78CB65807C0DC047C
modinfo[1]: ID = 2208, MD5 = 0AAEBF5D2937BDF78CB65807C0DC047C

I don't care which value gets chosen; they are likely to be the same anyway, since the MD5 value is the same.

Rasmus Hansen asked Apr 06 '15 11:04


1 Answer

You also need to override Object.Equals(object), not just implement IEquatable<ModInfo>.

If you add this to your class:

public override bool Equals(object other)
{
    ModInfo mod = other as ModInfo;
    if (mod != null)
        return Equals(mod);
    return false;
}

It should work.

See this article for more info: Implementing IEquatable Properly
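
To see what the override changes, here is a minimal sketch comparing two hypothetical types (TypedOnly and TypedAndObject are illustrative names, not from the question). It checks equality through the generic default comparer and through the static object.Equals, which always goes through the untyped Equals(object) overload.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: implements IEquatable<T> only, like the class in the question.
public class TypedOnly : IEquatable<TypedOnly>
{
    public string MD5 { get; set; }

    public bool Equals(TypedOnly other) => other != null && MD5 == other.MD5;

    public override int GetHashCode() => MD5?.GetHashCode() ?? 0;
}

// Hypothetical type: implements IEquatable<T> and also overrides Equals(object).
public class TypedAndObject : IEquatable<TypedAndObject>
{
    public string MD5 { get; set; }

    public bool Equals(TypedAndObject other) => other != null && MD5 == other.MD5;

    // Forward the untyped overload to the typed one.
    public override bool Equals(object obj) => Equals(obj as TypedAndObject);

    public override int GetHashCode() => MD5?.GetHashCode() ?? 0;
}

public static class Program
{
    public static void Main()
    {
        var a = new TypedOnly { MD5 = "ABC" };
        var b = new TypedOnly { MD5 = "ABC" };
        // Typed path: the default comparer uses IEquatable<TypedOnly>.Equals.
        Console.WriteLine(EqualityComparer<TypedOnly>.Default.Equals(a, b)); // True
        // Untyped path: falls back to reference equality without the override.
        Console.WriteLine(object.Equals(a, b)); // False

        var c = new TypedAndObject { MD5 = "ABC" };
        var d = new TypedAndObject { MD5 = "ABC" };
        Console.WriteLine(EqualityComparer<TypedAndObject>.Default.Equals(c, d)); // True
        Console.WriteLine(object.Equals(c, d)); // True
    }
}
```

With the override in place, both paths give the same answer, so the type behaves consistently no matter which overload a caller ends up using.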

EDIT: Okay, here's a slightly different implementation based on best practices with GetHashCode.

public class ModInfo : IEquatable<ModInfo>
{
    public int ID { get; set; }
    public string MD5 { get; set; }

    public bool Equals(ModInfo other)
    {
        if (other == null) return false;
        return (this.MD5.Equals(other.MD5));
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 13;
            hash = (hash * 7) + MD5.GetHashCode();
            return hash;
        }
    }

    public override bool Equals(object obj)
    {
        // "as" yields null for non-ModInfo arguments; Equals(ModInfo) rejects null.
        return Equals(obj as ModInfo);
    }
}

You can verify it:

ModInfo mod1 = new ModInfo {ID = 1, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C"};
ModInfo mod2 = new ModInfo {ID = 2, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C"};

// You should get true here
bool areEqual = mod1.Equals(mod2);

List<ModInfo> mods = new List<ModInfo> {mod1, mod2};

// You should get 1 result here
mods = mods.Distinct().ToList();
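
If changing the class is not an option, Distinct also has an overload that takes an IEqualityComparer<T>. The following is a minimal sketch under some assumptions: Md5Comparer is a hypothetical class that is not part of the original post, and ignoring case and surrounding whitespace in the MD5 string is an illustrative choice, useful only if the stored values might differ in those ways.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ModInfo
{
    public int ID { get; set; }
    public string MD5 { get; set; }
}

// Hypothetical comparer (not from the original post): two ModInfo objects
// are equal when their MD5 strings match, ignoring case and surrounding
// whitespace. A null MD5 is treated like an empty string.
public sealed class Md5Comparer : IEqualityComparer<ModInfo>
{
    private static string Normalize(ModInfo m) => m?.MD5?.Trim() ?? string.Empty;

    public bool Equals(ModInfo x, ModInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return StringComparer.OrdinalIgnoreCase.Equals(Normalize(x), Normalize(y));
    }

    // Must agree with Equals: equal objects produce equal hash codes.
    public int GetHashCode(ModInfo obj) =>
        StringComparer.OrdinalIgnoreCase.GetHashCode(Normalize(obj));
}

public static class Program
{
    public static void Main()
    {
        var mods = new List<ModInfo>
        {
            new ModInfo { ID = 2069, MD5 = "0AAEBF5D2937BDF78CB65807C0DC047C" },
            new ModInfo { ID = 2208, MD5 = " 0aaebf5d2937bdf78cb65807c0dc047c" },
        };

        // The comparer is passed explicitly, so ModInfo itself needs no Equals logic.
        var distinct = mods.Distinct(new Md5Comparer()).ToList();
        Console.WriteLine(distinct.Count); // 1
    }
}
```

Passing the comparer explicitly keeps the equality rule next to the query that relies on it, at the cost of having to repeat it wherever else the same rule is needed.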

For background on the constants 13 and 7, see: What's with those specific numbers in GetHashCode?

Fred Kleuver answered Oct 23 '22 10:10