As the title suggests this is a question about an implementation detail from HashMap#resize
- that's when the inner array is doubled in size. It's a bit wordy, but I've really tried to prove that I did my best understanding this...
This happens at a point when entries in this particular bucket/bin are stored in a Linked
fashion - thus having an exact order and in the context of the question this is important.
Generally the resize
could be called from other places as well, but let's look at this case only.
Suppose you put these strings as keys in a HashMap
(on the right there's the hashcode
after HashMap#hash
- that's the internal re-hashing.) Yes, these are carefully generated, not random.
DFHXR - 11111 YSXFJ - 01111 TUDDY - 11111 AXVUH - 01111 RUTWZ - 11111 DEDUC - 01111 WFCVW - 11111 ZETCU - 01111 GCVUR - 11111
There's a simple pattern to notice here - the last 4 bits are the same for all of them - which means that when we insert 8 of these keys (there are 9 total), they will end-up in the same bucket; and on the 9-th HashMap#put
, the resize
will be called.
So if currently there are 8 entries (having one of the keys above) in the HashMap
- it means there are 16 buckets in this map and the last 4 bits of they key decided where the entries end-up.
We put the nine-th key. At this point TREEIFY_THRESHOLD
is hit and resize
is called. The bins are doubled to 32
and one more bit from the keys decides where that entry will go (so, 5 bits now).
Ultimately this piece of code is reached (when resize
happens):
Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) != null); if (loTail != null) { loTail.next = null; newTab[j] = loHead; } if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; }
It's actually not that complicated... what it does it splits the current bin into entries that will move to other bins and to entries that will not move to other bins - but will remain into this one for sure.
And it's actually pretty smart how it does that - it's via this piece of code:
if ((e.hash & oldCap) == 0)
What this does is check if the next bit (the 5-th in our case) is actually zero - if it is, it means that this entry will stay where it is; if it's not it will move with a power of two offset in the new bin.
And now finally the question: that piece of code in the resize is carefully made so that it preserves the order of the entries there was in that bin.
So after you put those 9 keys in the HashMap
the order is going to be :
DFHXR -> TUDDY -> RUTWZ -> WFCVW -> GCVUR (one bin) YSXFJ -> AXVUH -> DEDUC -> ZETCU (another bin)
Why would you want to preserve order of some entries in the HashMap
. Order in a Map
is really bad as detailed here or here.
In Oracle JDK 8, HashMap resizes when the size is > threshold (capacity * load factor). With capacity of 16 and default load factor of 0.75 , resizing (to capacity of 32 ) takes place when the 13 th entry is put.
Hashmap uses the array of Nodes(named as table), where Node has fields like the key, value (and much more). Here the Node is represented by class HashMapEntry. Basically, HashMap has an array where the key-value data is stored. It calculates the index in the array where the Node can be placed and it is placed there.
1) HashMap handles collision by using a linked list to store map entries ended up in same array location or bucket location. 2) From Java 8 onwards, HashMap, ConcurrentHashMap, and LinkedHashMap will use the balanced tree in place of linked list to handle frequently hash collisions.
The design consideration has been documented within the same source file, in a code comment in line 211
* When bin lists are treeified, split, or untreeified, we keep * them in the same relative access/traversal order (i.e., field * Node.next) to better preserve locality, and to slightly * simplify handling of splits and traversals that invoke * iterator.remove. When using comparators on insertion, to keep a * total ordering (or as close as is required here) across * rebalancings, we compare classes and identityHashCodes as * tie-breakers.
Since removing mappings via an iterator can’t trigger a resize, the reasons to retain the order specifically in resize
are “to better preserve locality, and to slightly simplify handling of splits”, as well as being consistent regarding the policy.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With