Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Explanation of HashMap#hash(int) method

Tags:

Can someone please explain to me the static HashMap#hash(int) method?

What's the justification behind it to generate uniformly distributed hashes?

/**  * Applies a supplemental hash function to a given hashCode, which  * defends against poor quality hash functions.  This is critical  * because HashMap uses power-of-two length hash tables, that  * otherwise encounter collisions for hashCodes that do not differ  * in lower bits. Note: Null keys always map to hash 0, thus index 0.  */ static int hash(int h) {     // This function ensures that hashCodes that differ only by     // constant multiples at each bit position have a bounded     // number of collisions (approximately 8 at default load factor).     h ^= (h >>> 20) ^ (h >>> 12);     return h ^ (h >>> 7) ^ (h >>> 4); } 

An example would make it easier to digest.

Clarification I'm aware of the operators, truth tables and bitwise operations. I just can't really decode the implementation nor the comment really. Or even the reasoning behind it.

like image 839
qnoid Avatar asked Mar 10 '10 02:03

qnoid


People also ask

What is HashMap and how it works internally?

HashMap contains an array of the nodes, and the node is represented as a class. It uses an array and LinkedList data structure internally for storing Key and Value. There are four fields in HashMap. Before understanding the internal working of HashMap, you must be aware of hashCode() and equals() method.

What is HashMap in Java with examples?

The HashMap class of the Java collections framework provides the functionality of the hash table data structure. It stores elements in key/value pairs. Here, keys are unique identifiers used to associate each value on a map. The HashMap class implements the Map interface.

What is HashMap hash function?

A hash table, also known as a hash map, is a data structure that maps keys to values. It is one part of a technique called hashing, the other of which is a hash function. A hash function is an algorithm that produces an index of where a value can be found or stored in the hash table.

How put method works in HashMap?

put() method of HashMap is used to insert a mapping into a map. This means we can insert a specific key and the value it is mapping to into a particular map. If an existing key is passed then the previous value gets replaced by the new value. If a new pair is passed, then the pair gets inserted as a whole.


1 Answers

>>> is the logical right shift (no sign-extension) (JLS 15.19 Shift Operators), and ^ is the bitwise exclusive-or (JLS 15.22.1 Integer Bitwise Operators).

As to why this is done, the documentation offers a hint: HashMap uses power-of-two length tables, and hashes keys by masking away the higher bits and taking only the lower bits of their hash code.

// HashMap.java -- edited for conciseness static int indexFor(int h, int length) {     return h & (length-1); }  public V put(K key, V value) {     int hash = hash(key.hashCode());     int index = indexFor(hash, table.length);     // ... } 

So hash() attempts to bring relevancy to the higher bits, which otherwise would get masked away (indexFor basically discards the higher bits of h and takes only the lower k bits where length == (1 << k)).

Contrast this with the way Hashtable (which should have NOT a power-of-two length table) uses a key's hash code.

// Hashtable.java -- edited for conciseness public synchronized V get(Object key) {     int hash = key.hashCode();     int index = (hash & 0x7FFFFFFF) % table.length;     // ... } 

By doing the more expensive % operation (instead of simple bit masking), the performance of Hashtable is less sensitive to hash codes with poor distribution in the lower bits (especially if table.length is a prime number).

like image 61
polygenelubricants Avatar answered Sep 21 '22 08:09

polygenelubricants