Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# Distinct on IEnumerable<T> with custom IEqualityComparer

Here's what I'm trying to do. I'm querying an XML file using LINQ to XML, which gives me an IEnumerable<T> object, where T is my "Village" class, filled with the results of this query. Some results are duplicated, so I would like to perform a Distinct() on the IEnumerable object, like so:

public IEnumerable<Village> GetAllAlliances()
{
    try
    {
        IEnumerable<Village> alliances =
             from alliance in xmlDoc.Elements("Village")
             where alliance.Element("AllianceName").Value != String.Empty
             orderby alliance.Element("AllianceName").Value
             select new Village
             {
                 AllianceName = alliance.Element("AllianceName").Value
             };

        // TODO: make it work...
        return alliances.Distinct(new AllianceComparer());
    }
    catch (Exception ex)
    {
        throw new Exception("GetAllAlliances", ex);
    }
}

As the default comparer would not work for the Village object, I implemented a custom one, as seen here in the AllianceComparer class:

public class AllianceComparer : IEqualityComparer<Village>
{
    #region IEqualityComparer<Village> Members
    bool IEqualityComparer<Village>.Equals(Village x, Village y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) 
            return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.AllianceName == y.AllianceName;
    }

    int IEqualityComparer<Village>.GetHashCode(Village obj)
    {
        return obj.GetHashCode();
    }
    #endregion
}

The Distinct() method doesn't work, as I have exactly the same number of results with or without it. Another thing, and I don't know if it's usually possible, but I cannot step into AllianceComparer.Equals() to see what could be the problem.
I've found examples of this on the Internet, but I can't seem to make my implementation work.

Hopefully, someone here might see what could be wrong here! Thanks in advance!

like image 937
Fueled Avatar asked Jan 11 '09 12:01

Fueled


People also ask

What C is used for?

C programming language is a machine-independent programming language that is mainly used to create many types of applications and operating systems such as Windows, and other complicated programs such as the Oracle database, Git, Python interpreter, and games and is considered a programming foundation in the process of ...

Is C language easy?

Compared to other languages—like Java, PHP, or C#—C is a relatively simple language to learn for anyone just starting to learn computer programming because of its limited number of keywords.

What is C in C language?

What is C? C is a general-purpose programming language created by Dennis Ritchie at the Bell Laboratories in 1972. It is a very popular language, despite being old. C is strongly associated with UNIX, as it was developed to write the UNIX operating system.

What is the full name of C?

In the real sense it has no meaning or full form. It was developed by Dennis Ritchie and Ken Thompson at AT&T bell Lab. First, they used to call it as B language then later they made some improvement into it and renamed it as C and its superscript as C++ which was invented by Dr.


3 Answers

The problem is with your GetHashCode. You should alter it to return the hash code of AllianceName instead.

int IEqualityComparer<Village>.GetHashCode(Village obj)
{
    return obj.AllianceName.GetHashCode();
}

The thing is, if Equals returns true, the objects should have the same hash code which is not the case for different Village objects with same AllianceName. Since Distinct works by building a hash table internally, you'll end up with equal objects that won't be matched at all due to different hash codes.

Similarly, to compare two files, if the hash of two files are not the same, you don't need to check the files themselves at all. They will be different. Otherwise, you'll continue to check to see if they are really the same or not. That's exactly what the hash table that Distinct uses behaves.

like image 77
mmx Avatar answered Nov 05 '22 17:11

mmx


return alliances.Select(v => v.AllianceName).Distinct();

That would return an IEnumerable<string> instead of IEnumerable<Village>.

like image 20
superrcat Avatar answered Nov 05 '22 17:11

superrcat


Or change the line

return alliances.Distinct(new AllianceComparer());

to

return alliances.Select(v => v.AllianceName).Distinct();
like image 7
Jacob Seleznev Avatar answered Nov 05 '22 17:11

Jacob Seleznev