
Comparison via Equals or HashCode: which is faster?

I have to compare an object with the raw properties of the same class. That is, I have to compare this:

struct Identifier
{
    string name;
    string email;
}

with the two strings name and email. I know I could just create a new Identifier instance from name and email and pass that into Equals(), but my application has to be very fast and resource-saving.

I know that comparing via hash code isn't a good approach because, as explained here, there are collisions. But collisions are okay for me; I just need it to be fast.

So,

1) Is comparison via GetHashCode (checking whether the hash codes of both objects are the same) faster than Equals()?

2) Should I, instead of creating a new Identifier instance from the two values for the comparison, write a new method that takes the values directly? e.g.

struct Identifier {
  string name;
  string email;

  bool Equals(string name, string email) {
      // todo comparison via hashcode or equals
  }
}

I would use the Equals() and GetHashCode() methods generated by ReSharper.

michidk asked Mar 08 '16 11:03


2 Answers

Comparing hash codes could be faster if you save them on the Identifier instance (see below). However, it is not the same thing as comparing for equality.

Comparing hash codes lets you check if two items are definitely not equal to each other: you know this when you get different hash codes.

When hash codes are equal, however, you cannot make a definitive statement about the equality: the items could be equal or not equal to each other. That is why hash-based containers must always follow hash code comparison, direct or indirect, with a comparison for equality.
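As a rough illustration of the point above, here is a minimal sketch. The Key type and its deliberately constant hash code are hypothetical, invented only for this example. Every key collides, yet the Dictionary still keeps the two entries apart, because it falls back on Equals whenever hash codes match.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash code always collides (illustration only).
sealed class Key : IEquatable<Key>
{
    public string Value { get; }
    public Key(string value) { Value = value; }

    public bool Equals(Key other) => other != null && Value == other.Value;
    public override bool Equals(object obj) => Equals(obj as Key);

    // Legal but poor: every instance reports the same hash code.
    public override int GetHashCode() => 1;
}

static class CollisionDemo
{
    // Stores two unequal keys with identical hash codes and reports
    // how many entries the dictionary ends up holding.
    public static int CountDistinct()
    {
        var map = new Dictionary<Key, int>
        {
            [new Key("a")] = 1,
            [new Key("b")] = 2, // same hash as "a", but not equal to it
        };
        return map.Count;
    }
}
```

If the dictionary trusted hash codes alone, the second entry would overwrite the first; with the Equals check it holds both.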

Try implementing the comparison like this:

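struct Identifier {
    private readonly string name;
    private readonly string email;
    // Hash codes are computed once, when the instance is constructed.
    private readonly int nameHash;
    private readonly int emailHash;

    public Identifier(string name, string email) {
        this.name = name;
        nameHash = name.GetHashCode();
        this.email = email;
        emailHash = email.GetHashCode();
    }

    public bool Equals(string name, string email) {
        // Hash checks run first; the string comparisons only run if both hashes match.
        return name.GetHashCode() == nameHash
            && email.GetHashCode() == emailHash
            && name.Equals(this.name)
            && email.Equals(this.email);
    }
}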

Comparing against the pre-computed hash codes short-circuits the actual equality comparison, so you could save some CPU cycles when most of the comparisons end up returning false.

Sergey Kalinichenko answered Sep 30 '22 08:09


Is comparison via GetHashCode (checking whether the hash codes of both objects are the same) faster than Equals()?

You seem to be confusing the two concepts. The purpose of GetHashCode isn't to test equality between two object instances; it exists so that each object can readily provide a hash code value to any external code that may rely on it.

Equals, on the other hand, is there to determine equality. Two objects for which Equals returns true must provide the same hash code, but not the other way around.

The documentation on object.GetHashCode provides a pretty good explanation:

Two objects that are equal return hash codes that are equal. However, the reverse is not true: equal hash codes do not imply object equality, because different (unequal) objects can have identical hash codes. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method returns may differ between .NET Framework versions and platforms, such as 32-bit and 64-bit platforms. For these reasons, do not use the default implementation of this method as a unique object identifier for hashing purposes. Two consequences follow from this:

  • You should not assume that equal hash codes imply object equality.
  • You should never persist or use a hash code outside the application domain in which it was created, because the same object may hash differently across application domains, processes, and platforms.

If you want to check for equality between two instances, I definitely recommend implementing IEquatable<T> and overriding object.GetHashCode.
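A minimal sketch of what that could look like for the Identifier from the question. It assumes ordinal (case-sensitive) string comparison is what is wanted, and the 17/31 hash combination is just one common convention, not something the question prescribes. The two-argument Equals overload is the question's own idea of comparing against raw values without allocating a temporary instance.

```csharp
using System;

// Sketch: value equality for Identifier, assuming ordinal string comparison.
readonly struct Identifier : IEquatable<Identifier>
{
    private readonly string name;
    private readonly string email;

    public Identifier(string name, string email)
    {
        this.name = name;
        this.email = email;
    }

    // Strongly typed comparison: no boxing when called with another Identifier.
    public bool Equals(Identifier other) =>
        string.Equals(name, other.name, StringComparison.Ordinal)
        && string.Equals(email, other.email, StringComparison.Ordinal);

    // Compares against raw values without building a temporary Identifier.
    public bool Equals(string otherName, string otherEmail) =>
        string.Equals(name, otherName, StringComparison.Ordinal)
        && string.Equals(email, otherEmail, StringComparison.Ordinal);

    public override bool Equals(object obj) => obj is Identifier other && Equals(other);

    // Must agree with Equals: equal values always produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (name?.GetHashCode() ?? 0);
            hash = hash * 31 + (email?.GetHashCode() ?? 0);
            return hash;
        }
    }
}
```

The static string.Equals calls tolerate null fields, which matters for a struct because default(Identifier) leaves both strings null.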

As a side note - I see that you're using a struct. You should note that struct has different semantics in C# than in C++ or C, and I hope you're aware of them.

Yuval Itzchakov answered Sep 30 '22 09:09