
IEqualityComparer not working as intended

I have a List of paths of files stored on my computer. My aim is to first filter out the files which have the same name and then filter out those which have the same size.
To do so, I have made two classes implementing IEqualityComparer<string>, and implemented the Equals and GetHashCode methods in each.

var query = FilesList.Distinct(new CustomTextComparer())
                     .Distinct(new CustomSizeComparer()); 

The code for both classes is given below:

public class CustomTextComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        if (Path.GetFileName(x) == Path.GetFileName(y))
        {
            return true;
        }
        return false; 
    }
    public int GetHashCode(string obj)
    {
        return obj.GetHashCode();
    }
}
public class CustomSizeComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        if (new FileInfo(x).Length == new FileInfo(y).Length)
        {
            return true;
        }
        else
        {
            return false;
        }
    }
    public int GetHashCode(string obj)
    {
        return obj.GetHashCode();
    }
}

The code compiles and doesn't throw any exceptions, but it doesn't work: the duplicate files are not excluded.

How can I correct this so that the code works as intended?

Pratik Singhal asked Jan 28 '14 10:01


2 Answers

Change your GetHashCode to hash the same value that Equals compares. For your size comparer:

public int GetHashCode(string obj)
{
    return new FileInfo(obj).Length.GetHashCode();
}

And for the other:

public int GetHashCode(string obj)
{
    return Path.GetFileName(obj).GetHashCode();
}

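According to this answer, What's the role of GetHashCode in the IEqualityComparer<T> in .NET?, the hash code is evaluated first, and Equals is called only when two hash codes collide. Since your GetHashCode hashes the full path, two different paths practically never collide, so your Equals is never reached.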
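
To make this concrete, here is a minimal sketch that uses in-memory path strings instead of real files. The class names FullPathHashComparer, FileNameHashComparer and HashDemo, as well as the sample paths, are hypothetical and only serve the illustration. Both comparers share the same Equals; only GetHashCode differs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical comparer mirroring the question: Equals looks at the file name,
// but the hash covers the whole path, so equal items get different hashes.
public sealed class FullPathHashComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => Path.GetFileName(x) == Path.GetFileName(y);
    public int GetHashCode(string obj) => obj.GetHashCode();
}

// Hypothetical corrected comparer: the hash is computed from the same value
// that Equals compares, so equal items always share a hash code.
public sealed class FileNameHashComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => Path.GetFileName(x) == Path.GetFileName(y);
    public int GetHashCode(string obj) => Path.GetFileName(obj).GetHashCode();
}

public static class HashDemo
{
    // Sample data (an assumption for this sketch): two paths share the name "report.txt".
    public static readonly string[] SamplePaths = { "a/report.txt", "b/report.txt", "a/notes.txt" };

    // Counts how many items survive Distinct with the given comparer.
    public static int CountDistinct(IEqualityComparer<string> comparer) =>
        SamplePaths.Distinct(comparer).Count();
}
```

With FileNameHashComparer the two report.txt paths collapse into one entry. With FullPathHashComparer they will almost certainly both survive, because their hash codes differ and Equals is never consulted.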

Obviously it would be sensible to work on FileInfos, not on strings.

So maybe:

FilesList.Select(x => new FileInfo(x))
        .Distinct(new CustomTextComparer())
        .Distinct(new CustomSizeComparer());

Of course, then you have to change your comparers to work on the correct type.
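
A rough sketch of what comparers over FileInfo could look like follows. The names FileNameComparer, FileSizeComparer and FileFilter are assumptions made for this example and are not part of the original code. Note that FileInfo.Length throws if the file does not exist, so the sketch assumes every path points to an existing file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical comparer: two files are equal when their names (without directory) match.
public sealed class FileNameComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name;
    }

    // Hash the same value that Equals compares.
    public int GetHashCode(FileInfo obj) => obj.Name.GetHashCode();
}

// Hypothetical comparer: two files are equal when their sizes in bytes match.
public sealed class FileSizeComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Length == y.Length; // throws FileNotFoundException for missing files
    }

    public int GetHashCode(FileInfo obj) => obj.Length.GetHashCode();
}

public static class FileFilter
{
    // Drops files with a repeated name first, then files with a repeated size.
    public static List<FileInfo> DistinctByNameThenSize(IEnumerable<string> paths) =>
        paths.Select(p => new FileInfo(p))
             .Distinct(new FileNameComparer())
             .Distinct(new FileSizeComparer())
             .ToList();
}
```

Calling FileFilter.DistinctByNameThenSize(FilesList) would then replace the chained query from the question.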

Piotr Zierhoffer answered Sep 21 '22 19:09


Your GetHashCode must return the same value for any two objects that your Equals treats as equal:

// Try this
public int GetHashCode(string obj)
{
    return Path.GetFileName(obj).GetHashCode();
}

// And this
public int GetHashCode(string obj)
{
    return new FileInfo(obj).Length.GetHashCode();
}

But there is a much easier way to solve the whole problem without the extra classes:

var query = FilesList
                .GroupBy(f => Path.GetFileName(f)).Select(g => g.First())
                .GroupBy(f => new FileInfo(f).Length).Select(g => g.First())
                .ToList();
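
As a side note, and assuming .NET 6 or later is available, the same idea can probably be expressed with Enumerable.DistinctBy, which keeps the first element seen for each key. The class DistinctByDemo and the sizeOf parameter are hypothetical; sizeOf is passed in only so that the sketch can be tried without real files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DistinctByDemo
{
    // Keeps the first path seen for each file name, then the first path seen
    // for each size. sizeOf is a hypothetical hook that returns a file's size.
    public static List<string> Filter(IEnumerable<string> paths, Func<string, long> sizeOf) =>
        paths.DistinctBy(p => Path.GetFileName(p))
             .DistinctBy(sizeOf)
             .ToList();
}
```

For real files, the call would look like DistinctByDemo.Filter(FilesList, p => new FileInfo(p).Length).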

Rick Love answered Sep 22 '22 19:09