Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does HashMap's get() compare both the hash value and key in Java?

I was looking at the implementation of HashMap in JDK8. In the get methods, I saw the below line which is used to find the Node that matches with the given key.

if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))

Why is there the need to compare the hash value along with key? Why is the line above not written as:

if (((k = e.key) == key) || (key != null && key.equals(k)))

Is there any explanation on why it is done this way? Thank you.

like image 814
Sandeep Misal Avatar asked Jun 05 '18 16:06

Sandeep Misal


2 Answers

What seems to be causing your confusion, are two things:

1. Comparing the hash values is (often very much) faster than comparing keys directly.

2. In a == operator, the second condition won't be checked if the first is false.

So first the hash values are compared, which is fast:

  • When they are not equal, you know the keys aren't equal as well, and you are done.

  • When they are equal, you don't know if the keys are equal as well, so you must compare keys, which is (relatively) slow.

Since most keys are not equal, most of the time you only compare hashes. Only when keys are equal (or when hashes are equal due to hash collisions) do you compare the keys, which is rare, and thus you have a performance benefit.

like image 101
Max Vollmer Avatar answered Sep 20 '22 13:09

Max Vollmer


This is an efficient way to check if two values can possibly be equal.

The contract for hashcode mandates:

JavaDocs of Object.hashCode

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

Therefore, if the hashes are distinct, there is not point in performing further checks.

As HashMap requires the hashcodes for the keys anyway for choosing the buckets to put the entries in, the tradeoff is storing an additional int per entry vs. possibly having to compute it repeatedly and having to execute equals for the keys more often. HashMap is more optimized for fast retrieval and insertion, less so for memory efficiency.


Side Note: HashMap relies on keys not being mutated in any way that would change their "identity" in terms of equals and hashcode - while this may seem obvious, it is not explicitly mentioned in the JavaDocs for HashMap and has led to questions in the past: Are mutable hashmap keys a dangerous practice? - this is covered by the more general Map contract:

Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.

like image 44
Hulk Avatar answered Sep 20 '22 13:09

Hulk