How is HashMap
internally implemented? I read somewhere that it uses LinkedList
while other places it mentions Arrays.
I tried studying the code for HashSet
and found Entry
array. Then where is LinkedList
used?
HashMap contains an array of the nodes, and the node is represented as a class. It uses an array and LinkedList data structure internally for storing Key and Value.
Internally, the HashMap uses an Array, and it maps the labels to array indexes using a hash function. There are at least two ways to implement hashmap: Array: Using a hash function to map a key to the array index value.
HashMap uses its static inner class Node<K,V> for storing map entries. That means each entry in hashMap is a Node . Internally HashMap uses a hashCode of the key Object and this hashCode is further used by the hash function to find the index of the bucket where the new entry can be added.
Looking at the source code of HashMap tells us that it's implemented as an array of Entries: transient Entry[] table; Each Entry has a field next , so they create a single linked list structure: static class Entry<K,V> implements Map.
It basically looks like this:
this is the main array
↓
[Entry] → Entry → Entry ← here is the linked-list
[Entry]
[Entry] → Entry
[Entry]
[null ]
[null ]
So you have the main array where each index corresponds to some hash value (mod
'ed* to the size of the array).
Then each of them will point to the next entry with the same hash value (again mod
'ed*). This is where the linked-list comes in.
*: As a technical note, it's first hashed with a different function before being mod
'ed, but, as a basic implementation, just modding will work.
Each HashMap
has an Array and in that Array it places each Entry
in a position according to its key's hash code (e.g. int position = entry.getKey().hashCode() % array.length
). The position where an Entry
is stored is called a bucket.
If more than one Entry
ends up in the same bucket, those Entries are combined in a LinkedList
(also see @Dukeling's answer). Thus the bucket metaphor: each Array index is a "bucket" where you dump in all matching keys.
You have to use an Array for the buckets in order to achieve the desired constant time performance for random access. Within a bucket you have to traverse all elements to find the desired key anyways, so you can use a LinkedList
as it is easier to append to (no resize needed).
This also shows the need for a good hash function, because if all keys hash to only a few values you will get long LinkedList
s to search and a lot of (fast to access) empty buckets.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With