
Differences between IEquatable<T>, IEqualityComparer<T>, and overriding .Equals() when using LINQ on a custom object collection?

Tags:

c#

linq

I'm having some difficulty using LINQ's .Except() method when comparing two collections of a custom object.

I've derived my class from Object and implemented overrides for Equals(), GetHashCode(), and the operators == and !=. I've also created a CompareTo() method.

As a debugging experiment, I took the first item from each of my two collections (the two items are duplicates of each other) and compared them as follows:

itemListA[0].Equals(itemListB[0]);     // true
itemListA[0] == itemListB[0];          // true
itemListA[0].CompareTo(itemListB[0]);  // 0

In all three cases, the result is as I wanted. However, when I use LINQ's Except() method, the duplicate items are not removed:

List<MyObject> newList = itemListA.Except(itemListB).ToList();

Reading about how LINQ does comparisons, I've found various (conflicting?) pieces of advice that say I need to implement IEquatable<T> or IEqualityComparer<T> etc.

I'm confused because when I implement, for example, IEquatable<T>, I am required to provide a new Equals() method with a different signature from the one I've already overridden. Do I need to have two such methods with different signatures, or should I no longer derive my class from Object?

My object definition (simplified) looks like this:

public class MyObject : Object
{
    public string Name {get; set;}
    public DateTime LastUpdate {get; set;}

    public int CompareTo(MyObject other)
    {
        // ...
    }

    public override bool Equals(object obj)
    {
        // allows some tolerance on LastUpdate
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + Name.GetHashCode();
            hash = hash * 23 + LastUpdate.GetHashCode();
            return hash;
        }
    }

    // Overrides for operators
}

I noticed that when I implement IEquatable<T> I can do so using IEquatable<MyObject> or IEquatable<object>; the required Equals() signature changes depending on which one I use. What is the recommended way?

What I am trying to accomplish:

I want to be able to use LINQ (Distinct/Except) as well as the standard equality operators (== and !=) without duplicating code. The comparison should consider two objects equal if their names are identical and their LastUpdate values are within a user-specified tolerance (a number of seconds) of each other.

Edit:

Showing GetHashCode() code.

asked Apr 03 '13 20:04 by JYelton


3 Answers

It doesn't matter whether you override object.Equals and object.GetHashCode, implement IEquatable, or provide an IEqualityComparer. All of them can work, just in slightly different ways.

1) Overriding Equals and GetHashCode from object:

This is the base case, in a sense. It will generally work, assuming you're in a position to edit the type so that the implementations of the two methods are as desired. In many cases there's nothing wrong with doing just this.

2) Implementing IEquatable

The key point here is that you can (and should) implement IEquatable<YourTypeHere>. The key difference between this and #1 is that the Equals method is strongly typed, rather than taking object. This is more convenient for the programmer (added type safety) and also means that value types won't be boxed, which can improve performance for custom structs. If you do this, you should pretty much always do it in addition to #1, not instead of it. Having this Equals method differ in behavior from object.Equals would be...bad. Don't do that.

3) Implementing IEqualityComparer

This is entirely different from the first two. The idea here is that the object isn't computing its own hash code, or deciding whether it's equal to something else. The point of this approach is that the object doesn't know how to properly compute its hash or decide whether it's equal to something else. Perhaps it's because you don't control the code of the type (e.g. it comes from a 3rd party library) and its authors didn't bother to override the behavior, or perhaps they did override it but you want your own definition of "equality" in this particular context.

In this case you create an entirely separate "comparer" object that takes in two different objects and tells you whether they are equal, or what the hash code of one object is. When using this solution it doesn't matter what the Equals or GetHashCode methods of the type itself do; they won't be used.
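As a rough sketch of approach 3, the following shows a separate comparer passed to Except(). The names ThirdPartyItem, NameOnlyComparer and ComparerDemo are hypothetical and not from the question, and the "equal when the names match" rule is only an assumption chosen to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for a class whose source you cannot edit.
public sealed class ThirdPartyItem
{
    public string Name { get; set; }
    public DateTime LastUpdate { get; set; }
}

// Hypothetical comparer: for this sketch, two items are equal when their names match.
public sealed class NameOnlyComparer : IEqualityComparer<ThirdPartyItem>
{
    public bool Equals(ThirdPartyItem x, ThirdPartyItem y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return string.Equals(x.Name, y.Name, StringComparison.Ordinal);
    }

    // Must agree with Equals: items that compare equal return the same hash code.
    public int GetHashCode(ThirdPartyItem item)
    {
        if (ReferenceEquals(item, null) || item.Name == null) return 0;
        return StringComparer.Ordinal.GetHashCode(item.Name);
    }
}

public static class ComparerDemo
{
    // Returns the names of the items in listA that have no name match in listB.
    public static List<string> NamesOnlyInA()
    {
        var now = DateTime.UtcNow;
        var listA = new List<ThirdPartyItem>
        {
            new ThirdPartyItem { Name = "alpha", LastUpdate = now },
            new ThirdPartyItem { Name = "beta", LastUpdate = now }
        };
        var listB = new List<ThirdPartyItem>
        {
            new ThirdPartyItem { Name = "alpha", LastUpdate = now.AddSeconds(3) }
        };

        // The comparer, not the type's own Equals/GetHashCode, decides equality here.
        return listA.Except(listB, new NameOnlyComparer())
                    .Select(item => item.Name)
                    .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", NamesOnlyInA())); // prints "beta"
    }
}
```

Because the comparer is supplied at the call site, the same type can be compared by different rules in different places without touching the type itself.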


Note that all of this is entirely unrelated to the == operator, which is its own beast.

answered Oct 13 '22 18:10 by Servy


The basic pattern I use for equality in an object is the following. Note that only 2 methods have actual logic specific to the object. The rest is just boilerplate code that feeds into these 2 methods.

class MyObject : IEquatable<MyObject> { 
  public bool Equals(MyObject other) { 
    if (Object.ReferenceEquals(other, null)) {
      return false;
    }

    // Actual equality logic here
  }

  public override int GetHashCode() { 
    // Actual Hashcode logic here
  }

  public override bool Equals(Object obj) {
    return Equals(obj as MyObject);
  }

  public static bool operator==(MyObject left, MyObject right) { 
    if (Object.ReferenceEquals(left, null)) {
      return Object.ReferenceEquals(right, null);
    }
    return left.Equals(right);
  }

  public static bool operator!=(MyObject left, MyObject right) {
    return !(left == right);
  }
}

If you follow this pattern there is really no need to provide a custom IEqualityComparer<MyObject>. EqualityComparer<MyObject>.Default will be enough, as it relies on IEquatable<MyObject> to perform equality checks.
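To illustrate the point about the default comparer, here is a minimal sketch in which Except() is called without a comparer argument and still removes matches, because the element type implements IEquatable<T> and overrides GetHashCode consistently. The Tag class and DefaultComparerDemo are hypothetical, and equality by Name alone is an assumption made for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class following the pattern above; equality is by Name only.
public sealed class Tag : IEquatable<Tag>
{
    public string Name { get; private set; }

    public Tag(string name) { Name = name; }

    public bool Equals(Tag other)
    {
        if (ReferenceEquals(other, null)) return false;
        return string.Equals(Name, other.Name, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Tag);
    }

    // Uses exactly the data that Equals compares, so equal tags hash alike.
    public override int GetHashCode()
    {
        return Name == null ? 0 : StringComparer.Ordinal.GetHashCode(Name);
    }
}

public static class DefaultComparerDemo
{
    // Set difference using the default comparer for Tag.
    public static List<string> Difference()
    {
        var a = new[] { new Tag("c#"), new Tag("linq"), new Tag("linq") };
        var b = new[] { new Tag("c#") };
        return a.Except(b).Select(t => t.Name).ToList();
    }

    public static void Main()
    {
        // Except yields distinct elements, so "linq" appears once.
        Console.WriteLine(string.Join(", ", Difference())); // prints "linq"
        Console.WriteLine(EqualityComparer<Tag>.Default.Equals(new Tag("c#"), new Tag("c#"))); // prints "True"
    }
}
```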

answered Oct 13 '22 17:10 by JaredPar


You cannot "allow some tolerance on LastUpdate" and then use a GetHashCode() implementation that uses the strict value of LastUpdate!

Suppose this instance has LastUpdate at 23:13:13.933, and the obj instance has 23:13:13.932. These two might compare equal under your tolerance idea, but if so, their hash codes must be the same number. That will not happen unless you're extremely lucky, because DateTime.GetHashCode() should not give the same hash for these two times. Hash-based operations such as Except() and Distinct() check hash codes before they ever call Equals, so the duplicates are not detected.
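The mismatch can be reproduced with a small sketch. TolerantItem is a hypothetical reconstruction of the question's class, and the 5 second tolerance is an assumed value, not one given in the question. The result of the hash comparison depends on how the runtime implements DateTime.GetHashCode(), so no output is claimed for the last two lines.

```csharp
using System;
using System.Linq;

// Hypothetical reconstruction: Equals tolerates a small difference in
// LastUpdate, but GetHashCode uses the exact LastUpdate value.
public sealed class TolerantItem
{
    public string Name { get; set; }
    public DateTime LastUpdate { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as TolerantItem;
        if (ReferenceEquals(other, null)) return false;
        // Assumed tolerance of 5 seconds.
        return Name == other.Name
            && Math.Abs((LastUpdate - other.LastUpdate).TotalSeconds) <= 5;
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 23 + Name.GetHashCode();
            hash = hash * 23 + LastUpdate.GetHashCode(); // strict value: the problem
            return hash;
        }
    }
}

public static class HashMismatchDemo
{
    // Builds an item at 23:13:13 with the given millisecond component.
    public static TolerantItem Make(int milliseconds)
    {
        return new TolerantItem
        {
            Name = "item",
            LastUpdate = new DateTime(2013, 4, 3, 23, 13, 13, milliseconds)
        };
    }

    public static void Main()
    {
        var a = Make(933);
        var b = Make(932);

        Console.WriteLine(a.Equals(b)); // prints "True"

        // Expected to differ in practice, which is what breaks Except().
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());

        // Number of items left in { a } after removing { b }.
        Console.WriteLine(new[] { a }.Except(new[] { b }).Count());
    }
}
```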

Besides, your Equals method must be a transitive relation, mathematically speaking, and "approximately equal to" cannot be made transitive. Its transitive closure is the trivial relation that identifies everything.
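A short sketch of the transitivity problem, with an assumed tolerance of 5 seconds: a is close to b and b is close to c, yet a is not close to c. The Bucket helper is a hypothetical alternative, not something proposed in the answers. It rounds each time down to a fixed-width window, which gives a transitive relation that can also be hashed, at the cost of treating two times on either side of a window boundary as different.

```csharp
using System;

public static class ToleranceDemo
{
    // Hypothetical tolerance check: true when x and y are within
    // toleranceSeconds of each other.
    public static bool WithinTolerance(DateTime x, DateTime y, double toleranceSeconds)
    {
        return Math.Abs((x - y).TotalSeconds) <= toleranceSeconds;
    }

    // Hypothetical alternative: index of the fixed-width window containing t.
    // Equal bucket indexes form a transitive relation and can feed GetHashCode.
    public static long Bucket(DateTime t, int bucketSeconds)
    {
        return t.Ticks / (TimeSpan.TicksPerSecond * bucketSeconds);
    }

    public static void Main()
    {
        var a = new DateTime(2013, 4, 3, 20, 0, 0);
        var b = a.AddSeconds(4);
        var c = a.AddSeconds(8);

        Console.WriteLine(WithinTolerance(a, b, 5)); // prints "True"
        Console.WriteLine(WithinTolerance(b, c, 5)); // prints "True"
        Console.WriteLine(WithinTolerance(a, c, 5)); // prints "False"

        // Bucketing is transitive but is not a true tolerance:
        Console.WriteLine(Bucket(a, 5) == Bucket(b, 5));               // prints "True"
        Console.WriteLine(Bucket(b, 5) == Bucket(a.AddSeconds(5), 5)); // prints "False"
    }
}
```

If a true tolerance is required, one option is to keep Equals strict and do the tolerant matching in an explicit query rather than through Except() or Distinct().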

answered Oct 13 '22 19:10 by Jeppe Stig Nielsen