I analyse the HashMap
source code in Java and get a question about the put
method.
Below is the put
method in JDK1.6:
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
I get confused about the if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
Why is the condition like this?
Because in Java super class Object
, there is a contract of hashCode
and equals
:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
So the key.equals(k)
implies key.hashCode() == k.hashCode()
.
The hash()
is below:
static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
thus key.hashCode() == k.hashCode()
implies e.hash == hash
.
So why isn't the condition like if ((k = e.key) == key || key.equals(k))
?
In HashMap, hashCode() is used to calculate the bucket and therefore calculate the index. equals() method: This method is used to check whether 2 objects are equal or not. This method is provided by the Object class. You can override this in your class to provide your implementation.
HashCode collisions Whenever two different objects have the same hash code, we call this a collision. A collision is nothing critical, it just means that there is more than one object in a single bucket, so a HashMap lookup has to look again to find the right object.
1) If two Objects are equal according to equal(), then calling the hashcode method on each of those two objects should produce same hashcode. 2) It is not required that if two objects are unequal according to the equal(), then calling the hashcode method on each of the two objects must produce distinct values.
It's just an optimization: comparing two integers is faster than calling equals()
.
If the two hashcodes differ, then, based on the contract of equals
and hashCode
, the map knows that the existing key isn't equal to the given key, and can go faster to the next one.
It's just avoiding a method call when it can: If the hash (which isn't hashCode()
, it's the map's own hashing) is different from the entry's hash, it knows it doesn't have to call equals
at all. Just optimizing a bit.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With