Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How should I go about implementing Object.GetHashCode() for complex equality?

Basically, I have the following so far:

class Foo {
    public override bool Equals(object obj)
    {
        Foo d = obj as Foo ;
        if (d == null)
            return false;

        return this.Equals(d);
    }

    #region IEquatable<Foo> Members

    public bool Equals(Foo other)
    {
        if (this.Guid != String.Empty && this.Guid == other.Guid)
            return true;
        else if (this.Guid != String.Empty || other.Guid != String.Empty)
            return false;

        if (this.Title == other.Title &&
            this.PublishDate == other.PublishDate &&
            this.Description == other.Description)
            return true;

        return false;
    }
}

So, the problem is this: I have a non-required field Guid, which is a unique identifier. If this isn't set, then I need to try to determine equality based on less accurate metrics as an attempt at determining if two objects are equal. This works fine, but it make GetHashCode() messy... How should I go about it? A naive implementation would be something like:

public override int GetHashCode() {
    if (this.Guid != String.Empty)
        return this.Guid.GetHashCode();

    int hash = 37;
    hash = hash * 23 + this.Title.GetHashCode();
    hash = hash * 23 + this.PublishDate.GetHashCode();
    hash = hash * 23 + this.Description.GetHashCode();
    return hash;
}

But what are the chances of the two types of hash colliding? Certainly, I wouldn't expect it to be 1 in 2 ** 32. Is this a bad idea, and if so, how should I be doing it?

like image 453
Matthew Scharley Avatar asked Jul 02 '09 01:07

Matthew Scharley


1 Answers

A very easy hash code method for custom classes is to bitwise XOR each of the fields' hash codes together. It can be as simple as this:

int hash = 0;
hash ^= this.Title.GetHashCode();
hash ^= this.PublishDate.GetHashCode();
hash ^= this.Description.GetHashCode();
return hash;

From the link above:

XOR has the following nice properties:

  • It does not depend on order of computation.
  • It does not “waste” bits. If you change even one bit in one of the components, the final value will change.
  • It is quick, a single cycle on even the most primitive computer.
  • It preserves uniform distribution. If the two pieces you combine are uniformly distributed so will the combination be. In other words, it does not tend to collapse the range of the digest into a narrower band.

XOR doesn't work well if you expect to have duplicate values in your fields as duplicate values will cancel each other out when XORed. Since you're hashing together three unrelated fields that should not be a problem in this case.

like image 58
John Kugelman Avatar answered Sep 23 '22 21:09

John Kugelman