
Questions about IEqualityComparer<T> / List<T>.Distinct()

Here is the equality comparer I just wrote because I wanted a distinct set of items from a list containing entities.

    class InvoiceComparer : IEqualityComparer<Invoice>
    {
        public bool Equals(Invoice x, Invoice y)
        {
            // A
            if (Object.ReferenceEquals(x, y)) return true;

            // B
            if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null)) return false;

            // C
            return x.TxnID == y.TxnID;
        }

        public int GetHashCode(Invoice obj)
        {
            if (Object.ReferenceEquals(obj, null)) return 0;
            // Hash the same member that Equals compares, so equal invoices get equal hash codes.
            return obj.TxnID.GetHashCode();
        }
    }
  1. Why does Distinct require a comparer as opposed to a Func<T,T,bool>?
  2. Are (A) and (B) anything other than optimizations, and are there scenarios where they would not behave as expected because of subtleties in comparing references?
  3. If I wanted to, could I replace (C) with

    return GetHashCode(x) == GetHashCode(y);

Aaron Anodide asked Dec 15 '11 21:12


2 Answers

  1. So that it can use hash codes, which lets Distinct run in O(n) rather than O(n²).
  2. (A) is an optimization.
    (B) is necessary; otherwise, (C) would throw a NullReferenceException when x or y is null. If Invoice is a struct, however, they're both unnecessary and slower.
  3. No. Hash codes are not unique.
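As a rough illustration of point 1, here is a minimal sketch of the comparer being passed to Distinct. The Invoice class below is a hypothetical stand-in (the question does not show it), and TxnID is assumed to be a string; the DistinctDemo helper is likewise made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; only TxnID matters here.
class Invoice
{
    public string TxnID { get; set; }
}

class InvoiceComparer : IEqualityComparer<Invoice>
{
    public bool Equals(Invoice x, Invoice y)
    {
        if (ReferenceEquals(x, y)) return true;                          // same object, or both null
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false; // exactly one is null
        return x.TxnID == y.TxnID;
    }

    public int GetHashCode(Invoice obj)
    {
        // Hash the same member that Equals compares.
        if (ReferenceEquals(obj, null) || obj.TxnID == null) return 0;
        return obj.TxnID.GetHashCode();
    }
}

static class DistinctDemo
{
    // Returns the TxnIDs that survive Distinct.
    public static List<string> DistinctIds()
    {
        var invoices = new List<Invoice>
        {
            new Invoice { TxnID = "A-1" },
            new Invoice { TxnID = "A-1" }, // different object, same TxnID
            new Invoice { TxnID = "B-2" },
        };

        // Distinct buckets items by GetHashCode and calls Equals only
        // for items that land in the same bucket.
        return invoices.Distinct(new InvoiceComparer())
                       .Select(i => i.TxnID)
                       .ToList();
    }
}
```

Calling DistinctDemo.DistinctIds() should leave two entries, one for "A-1" and one for "B-2". A bare Func<T,T,bool> could only answer "are these two equal?", so every item would have to be compared against every item already kept.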
SLaks answered Sep 28 '22 01:09


  • A is a simple and quick way to check whether both references point to the same object (the same memory address).
  • B - if one of the references is null, the two objects cannot be equal, so there is no point in comparing them any further.
  • C - no, GetHashCode() can return the same value for different objects (a hash collision), so you still have to do the equality comparison.

Regarding the same hash code value for different objects, MSDN says:

If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two objects do not have to return different values.
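To make the collision problem concrete, here is a small sketch that uses a deliberately weak hash (the string length) so that collisions are guaranteed. Both comparer classes and the CollisionDemo helper are hypothetical names invented for this illustration; real hash functions collide far less often, but the same failure is possible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Correct comparer: weak hash, but Equals still compares the actual values.
class LengthHashComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => string.Equals(x, y, StringComparison.Ordinal);
    public int GetHashCode(string s) => s == null ? 0 : s.Length;
}

// Broken comparer: treats "same hash code" as "equal", as in question 3.
class HashOnlyComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => GetHashCode(x) == GetHashCode(y);
    public int GetHashCode(string s) => s == null ? 0 : s.Length;
}

static class CollisionDemo
{
    // Three different IDs that all hash to 3 under the length-based hash.
    static readonly string[] Ids = { "A-1", "B-2", "C-3" };

    // Equals tells the colliding items apart, so all three are kept.
    public static int CorrectCount() => Ids.Distinct(new LengthHashComparer()).Count();

    // Colliding items are wrongly treated as duplicates, so only one is kept.
    public static int BrokenCount() => Ids.Distinct(new HashOnlyComparer()).Count();
}
```

With these inputs, CorrectCount() returns 3 and BrokenCount() returns 1: the hash code only narrows down the candidates, and Equals has to make the final decision.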

sll answered Sep 28 '22 02:09