I am wondering what the time complexity of Java HashMap resizing is when the load factor exceeds the threshold. As far as I understand, for HashMap the table size is always a power of 2 (an even number), so whenever we resize the table we don't necessarily need to rehash all the keys (correct me if I am wrong); all we need to do is allocate additional space and copy over all the entries from the old table (I am not quite sure how the JVM deals with that internally), correct? Whereas for Hashtable, since it uses a prime number as the table size, we need to rehash all the entries whenever we resize the table. So my question is: does it still take O(n) linear time for resizing a HashMap?
Since the size of the new array has changed, the indexing function (which returns the bitwise operation hash(key) AND (sizeOfArray - 1)) changes. So the resizing of the array creates twice as many buckets (i.e. linked lists) and redistributes all the existing entries into the buckets (the old ones and the newly created ones).
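A minimal sketch of that idea (illustrative variable names, not the JDK source): because the capacity is a power of two, the bucket index is just the low bits of the hash, and doubling the capacity exposes one more bit, so each entry either stays at its old index or moves to oldIndex + oldCapacity.

```java
public class BucketIndexSketch {
    public static void main(String[] args) {
        Object key = "someKey";
        int hash = key.hashCode();
        int oldCapacity = 16;
        int newCapacity = oldCapacity * 2;        // capacity always doubles

        int oldIndex = hash & (oldCapacity - 1);  // low 4 bits of the hash
        int newIndex = hash & (newCapacity - 1);  // low 5 bits of the hash

        // After a resize each entry lands either at its old index or at
        // oldIndex + oldCapacity, depending on the newly exposed hash bit.
        System.out.println(newIndex == oldIndex || newIndex == oldIndex + oldCapacity); // true
    }
}
```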
A HashMap is generally resized at a 0.75 (= 3/4) load factor. Using this information, you can size a HashMap to 4/3 times the number of entries you need to store, so it never has to resize while you fill it.
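For example, a hedged sketch of presizing (the 4/3 factor comes from the default 0.75 load factor; the +1 guards against rounding down; the numbers are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMap {
    public static void main(String[] args) {
        int expectedEntries = 1000;
        // Pick a capacity large enough that entries / capacity stays below 0.75,
        // so no resize happens while filling the map.
        int initialCapacity = expectedEntries * 4 / 3 + 1;
        Map<String, Integer> map = new HashMap<>(initialCapacity);

        for (int i = 0; i < expectedEntries; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 1000, with no resize along the way
    }
}
```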
On average, HashMap insertion, deletion, and search take O(1) constant time. That said, in the worst case, Java takes O(n) time for searching, insertion, and deletion. Java uses chaining and rehashing to handle collisions.
As soon as the 13th element (key-value pair) is put into the HashMap, it increases its capacity from the default 2^4 = 16 buckets to 2^5 = 32 buckets. Another way to see it: when the load factor ratio (number of entries / capacity) reaches 0.75, the HashMap increases its capacity.
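A minimal sketch of that threshold arithmetic (this is not HashMap's source, just the numbers it implies):

```java
public class ResizeThreshold {
    public static void main(String[] args) {
        int capacity = 16;                             // default initial capacity (2^4)
        float loadFactor = 0.75f;                      // default load factor
        int threshold = (int) (capacity * loadFactor); // 12
        System.out.println(threshold);                 // 12: the 13th put triggers a resize
        System.out.println(capacity * 2);              // 32: the new capacity after doubling
    }
}
```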
Does it still take O(N) time for resizing a HashMap?
Basically, yes.
And a consequence is that an insertion operation that causes a resize will take O(N) time. But that happens on O(1/N) of all insertions, so (under certain assumptions) the average insertion time is O(1).
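A rough way to see why the average stays O(1) is to simulate the copy work caused by the doubling schedule (illustrative numbers, not the JDK's implementation):

```java
public class AmortizedResizeCost {
    public static void main(String[] args) {
        int capacity = 16;        // default initial capacity
        long copied = 0;          // total entries redistributed across all resizes
        int n = 1_000_000;        // number of insertions to simulate

        for (int size = 1; size <= n; size++) {
            if (size > capacity * 3 / 4) {  // 0.75 load factor exceeded
                copied += size - 1;          // every existing entry is moved once
                capacity *= 2;               // table doubles
            }
        }
        // 'copied' grows roughly linearly with n (not quadratically),
        // so the resize work averages out to O(1) per insertion.
        System.out.println(copied + " entries copied during " + n + " puts");
    }
}
```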
So could a good load factor affect this performance? Like better and faster than O(N)?
Choice of load factor affects performance, but not complexity.
If we make normal assumptions about the hash function and key clustering, then when the load factor is larger:
- the average hash chain is longer, but lookup within a bucket is still O(1) on average;
- resizes happen less often, but a resize is still triggered on O(1/N) of insertions;
- the cost of each resize is still O(N).
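For what it's worth, the load factor is tunable at construction time. A hedged sketch of the trade-off (lower factor = more memory and shorter chains, higher factor = the opposite), without changing any of the asymptotic bounds above:

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorChoice {
    public static void main(String[] args) {
        // Both maps behave the same asymptotically; only the space/collision
        // trade-off and the resize frequency (by a constant factor) differ.
        Map<String, Integer> sparse = new HashMap<>(16, 0.5f); // resizes earlier, shorter chains
        Map<String, Integer> dense  = new HashMap<>(16, 0.9f); // resizes later, longer chains

        for (int i = 0; i < 100; i++) {
            sparse.put("k" + i, i);
            dense.put("k" + i, i);
        }
        System.out.println(sparse.size() + " " + dense.size()); // 100 100
    }
}
```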
.... so whenever we resize the table we don't necessarily need to rehash all the keys (correct me if I am wrong).
Actually, you would need to rehash all of the keys. When you double the hash table size, the hash chains need to be split. To do this, you need to test which of two chains the hash value for every key maps to. (Indeed, you would need to do the same if the hash table used open addressing too.)
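A minimal sketch of that split test, assuming the cached hash value is available (illustrative, not the JDK source): since doubling the capacity exposes exactly one more index bit, a single AND with the old capacity decides which of the two chains an entry belongs to.

```java
public class SplitDecision {
    public static void main(String[] args) {
        int oldCapacity = 16;
        int hash = "someKey".hashCode();   // in HashMap this value is cached per entry

        int oldIndex = hash & (oldCapacity - 1);
        // The bit selected by oldCapacity is the one newly exposed by doubling:
        // if it is 0 the entry stays put, otherwise it moves up by oldCapacity.
        int newIndex = (hash & oldCapacity) == 0 ? oldIndex : oldIndex + oldCapacity;

        System.out.println(oldIndex + " -> " + newIndex);
    }
}
```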
However, in the current generation of HashMap implementations, the hashcode values are cached in the chained entry objects, so the hashcode for a key never needs to be recomputed.
One comment mentioned the degenerate case where all keys hash to the same hashcode. That can happen either due to a poorly designed hash function, or a skewed distribution of keys.
This affects performance of lookup, insertion and other operations, but it does not affect either the cost or frequency of resizes.