I was going through Java's implementation of the put method for a hashtable and came across this :
// Makes sure the key is not already in the hashtable.
Entry tab[] = table;
int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;
for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
if ((e.hash == hash) && e.key.equals(key)) {
V old = e.value;
e.value = value;
return old;
}
}
While I understand that a key is required to check for collisions, why is Java storing the hash value of the key and also checking it ?
A hash table, also known as a dictionary or associative array, is a compact data structure that stores one or more key/value pairs.
It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.
Like HashMap, Hashtable stores key/value pairs in a hash table. When using a Hashtable, you specify an object that is used as a key, and the value that you want linked to that key. The key is then hashed, and the resulting hash code is used as the index at which the value is stored within the table.
Hash tables let us implement things like phone books or dictionaries; in them, we store the association between a value (like a dictionary definition of the word "lamp") and its key (the word "lamp" itself). We can use hash tables to store, retrieve, and delete data uniquely based on their unique key.
Because the same bucket (tab
) can hold items having different hashes due to % tab.length
operation. Checking hash first is probably some performance optimization to avoid calling equals()
if hashes are different.
To put this in an example: Say you have two complex objects with costly equals()
method. One object has hash equal to 1 while the other object has hash of 32. If you put both objects in a hash table having 31 buckets, they'll end up in the same bucket (tab
). When adding a second (different object) you must make sure it's not yet in the table. You can use equals()
immediately, but this might be slower. Instead you first compare hashes, avoiding costly equals()
if not necessary. In this example hashes are different (despite being in the same bucket) so equals()
is not necessary.
It makes access faster, since the hash value does not need to be recomputed for every access. This is important not just for explicit searches (where the hash is checked before doing equals
) but also for rehash.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With