Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HashMap implementation in Java. How does the bucket index calculation work?

Tags:

java

hashmap

I am looking at the implementation of HashMap in Java and am stuck at one point.
How is the indexFor function calculated?

static int indexFor(int h, int length) {    return h & (length-1); } 

Thanks

like image 369
gnreddy Avatar asked Jun 04 '12 09:06

gnreddy


People also ask

How does HashMap determine bucket size?

Default capacity of Hashmap is 2^4 = 16 buckets. Let say we have well implemented hashcode() method, which make sure that key-value pair will be well distributed across 16 buckets equally. So, If there are 16 items in hashmap, then good hashcode method will distribute 1 item in each bucket.

How do HashMap buckets work?

HashMap uses multiple buckets and each bucket points to a Singly Linked List where the entries (nodes) are stored. Once the bucket is identified by the hash function using hashcode, then hashCode is used to check if there is already a key with the same hashCode or not in the bucket(singly linked list).

How HashMap find bucket location?

HashMap stores elements in so-called buckets and the number of buckets is called capacity. When we put a value in the map, the key's hashCode() method is used to determine the bucket in which the value will be stored. To retrieve the value, HashMap calculates the bucket in the same way – using hashCode().


2 Answers

The hash itself is calculated by the hashCode() method of the object you're trying to store.

What you see here is calculating the "bucket" to store the object based on the hash h. Ideally, to evade collisions, you would have the same number of buckets as is the maximum achievable value of h - but that could be too memory demanding. Therefore, you usually have a lower number of buckets with a danger of collisions.

If h is, say, 1000, but you only have 512 buckets in your underlying array, you need to know where to put the object. Usually, a mod operation on h would be enough, but that's too slow. Given the internal property of HashMap that the underlying array always has number of buckets equal to 2^n, the Sun's engineers could use the idea of h & (length-1), it does a bitwise AND with a number consisting of all 1's, practically reading only the n lowest bits of the hash (which is the same as doing h mod 2^n, only much faster).

Example:

     hash h: 11 1110 1000  -- (1000 in decimal)    length l: 10 0000 0000  -- ( 512 in decimal)       (l-1): 01 1111 1111  -- ( 511 in decimal - it will always be all ONEs) h AND (l-1): 01 1110 1000  -- ( 488 in decimal which is a result of 1000 mod 512) 
like image 190
Petr Janeček Avatar answered Oct 03 '22 14:10

Petr Janeček


It's not calculating the hash, it's calculating the bucket.

The expression h & (length-1) does a bit-wise AND on h using length-1, which is like a bit-mask, to return only the low-order bits of h, thereby making for a super-fast variant of h % length.

like image 36
Bohemian Avatar answered Oct 03 '22 14:10

Bohemian