I was reading about the fact that How exactly the HashMap
works in java
.I found code in the hash
method in the HashMap
class the hashcode
is one of the operand with Shift right zero fill operator
.The other operands
are like 12
7
4
20
. Later some more processing is done on the result .My question is why only these four number are chossen for calculating the value in hash function which actually used for calculating the position in the bucket
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
It’s not that “only these four number are chosen for calculating the value in hash function”, the hash code returned by the hashCode
method of the key object is the (very important) input. This method in the HashMap
implementation just tries to improve this, given the knowledge about how the HashMap
will use that value afterwards.
Typical implementations will only use the lower bits of a hash code as the internal table has a size which is a power of two. Therefore the improvement shall ensure that the likelihood of having different values in the lower bits is the same even if the original hash codes for different keys differ in upper bits only.
Take for example Integer
instances used as keys: their hash code is identical to their value as this will spread the hash codes over the entire 2³² int range. But if you put the values 0xa0000000
, 0xb0000000
, 0xc0000000
, 0xd0000000
into the map, a map using only the lower bits would have poor results. This improvement fixes this.
The numbers chosen for this bit manipulation, and the algorithm in general are a field of continuous investigations. You will see changes between JVM implementations as the development never stops.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With