A quote from the book I'm reading Head First Java:
The point is that hashcodes can be the same without necessarily guaranteeing that the objects are equal, because the "hashing algorithm" used in the
hashCode()
method might happen to return the same value for multiple objects.
Why might the hashCode()
method return the same value for different objects? Does that not cause problems?
1) If two objects are equal (i.e. the equals() method returns true), they must have the same hashcode. 2) If the hashCode() method is called multiple times on the same object, it must return the same result every time. 3) Two different objects can have the same hash code.
The general contract of hashCode() states: Whenever it is invoked on the same object more than once during an execution of a Java application, hashCode() must consistently return the same value, provided no information used in equals comparisons on the object is modified.
Whenever two different objects have the same hash code, we call this a collision. A collision is nothing critical, it just means that there is more than one object in a single bucket, so a HashMap lookup has to look again to find the right object.
If two objects have the same hashcode then they are NOT necessarily equal. Otherwise you will have discovered the perfect hash function. But the opposite is true: if the objects are equal, then they must have the same hashcode .
hashing an object means "finding a good, descriptive value (number) that can be reproduced by the very same instance again and again". Because hash codes from Java's Object.hashCode()
are of type int
, you can only have 2^32
different values. That's why you will have so-called "collisions" depending on the hashing algorithm, when two distinct Objects produce the same hashCode.
Typically, this does not produce any problems, because hashCode()
is mostly used together with equals()
. For instance, a HashMap
will call hashCode()
upon its keys, to know whether the keys may already be contained in the HashMap. If the HashMap does not find the hash code, it's obvious the key is not contained in the HashMap yet. But if it does, it will have to double-check all keys having that same hash code using equals()
.
I.e.
A.hashCode() == B.hashCode() // does not necessarily mean
A.equals(B)
But
A.equals(B) // means
A.hashCode() == B.hashCode()
If equals()
and hashCode()
are implemented correctly.
For a more precise description of the general hashCode
contract, see the Javadoc.
There are only just over 4 billion possible hashcodes (the range of an int
) , but the number of objects you could choose to create is much larger. Therefore some objects must share the same hash code, by the pigeonhole principle.
For example the number of possible strings containing 10 letters from A-Z is 26**10 which is 141167095653376. It is impossible to assign all of these strings a unique hash code. Nor is it important - the hash code doesn't need to be unique. It just needs to have not too many collisions for real data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With