 

Time complexity for Java HashMap resizing

I am wondering what the time complexity of resizing a Java HashMap is when the number of entries exceeds the threshold set by the load factor. As far as I understand, the table size of a HashMap is always a power of 2, so whenever we resize the table we don't necessarily need to rehash all the keys (correct me if I am wrong); all we need to do is allocate additional space and copy over all the entries from the old table without rehashing (I am not quite sure how the JVM deals with that internally), correct? Hashtable, on the other hand, uses a prime number as the table size, so we need to rehash all the entries whenever we resize the table. So my question is: does resizing a HashMap still take O(n) linear time?

peter asked Jan 10 '13 05:01



People also ask

What happens when a HashMap resizes?

Since the size of the new array has changed, the result of the indexing function (the bitwise operation “hash(key) AND (sizeOfArray-1)”) changes as well. So, resizing the array creates twice as many buckets (i.e. linked lists) and redistributes all the existing entries across the old buckets and the newly created ones.
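As a rough illustration of that indexing function, the sketch below computes a bucket index for a table of 16 slots and then of 32 slots. The class and method names are made up for this example. The `spread` step mirrors the bit-mixing that OpenJDK 8 and later apply to `hashCode()`; other implementations may differ.

```java
class BucketIndex {
    // Mixes the high 16 bits into the low 16 bits (as OpenJDK 8+ HashMap does),
    // so that small tables still see the upper part of the hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int hash = spread(21);             // 21 has no high bits, so spread(21) == 21
        int oldIndex = indexFor(hash, 16); // 21 & 15
        int newIndex = indexFor(hash, 32); // 21 & 31
        System.out.println(oldIndex + " -> " + newIndex); // prints "5 -> 21"
    }
}
```

After doubling, an entry either keeps its index or moves up by exactly the old table length, because only one extra bit of the hash takes part in the mask.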

Can we resize HashMap in Java?

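By default, a HashMap is resized once it is more than 0.75 (= 3/4) full; this ratio is the load factor. Using this information, you can create a HashMap with an initial capacity of 4/3 times the number of entries you need to store, so that it does not have to resize while you fill it.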
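A minimal sketch of that pre-sizing rule, assuming the default load factor of 0.75. `PresizedMap` and `capacityFor` are hypothetical names for this example, not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

class PresizedMap {
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Initial capacity to request so that expectedEntries fit without a resize.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / (double) DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expected = 1000;
        // HashMap rounds the requested capacity up to the next power of two.
        Map<String, Integer> map = new HashMap<>(capacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // prints 1000
    }
}
```

On JDK 19 and later, `HashMap.newHashMap(int numMappings)` performs this calculation for you.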

What is the time complexity of a HashMap?

On average, insertion, deletion and search in a HashMap take O(1) constant time. That said, in the worst case, when many keys collide in the same bucket, searching, insertion and deletion take O(n) time. Java uses chaining and rehashing to handle collisions.

By how much does a HashMap increase its size?

As soon as the 13th element (key-value pair) is added to the HashMap, it increases its size from the default 2^4 = 16 buckets to 2^5 = 32 buckets. Another way to calculate it: when the ratio of stored entries to buckets exceeds the load factor of 0.75, the HashMap doubles its capacity.
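The arithmetic behind the "13th element" can be sketched as follows. The helper names are invented for this example, and the rule "resize when size exceeds capacity × load factor" is assumed to match what OpenJDK's HashMap does.

```java
class ResizeThreshold {
    // Largest number of entries a table of this capacity holds before doubling.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // The insertion (counted from 1) that pushes the size past the threshold.
    static int firstResizingInsertion(int capacity, float loadFactor) {
        return threshold(capacity, loadFactor) + 1;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));              // prints 12
        System.out.println(firstResizingInsertion(16, 0.75f)); // prints 13
        System.out.println(threshold(32, 0.75f));              // prints 24
    }
}
```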


1 Answer

Does it still take O(N) time for resizing a HashMap?

Basically, yes.

And a consequence is that an insertion operation that causes a resize will take O(N) time. But that happens on O(1/N) of all insertions, so (under certain assumptions) the average insertion time is O(1).
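To make the amortized argument concrete, here is a small simulation. It does not touch a real HashMap; it assumes a table that doubles whenever the size exceeds capacity × load factor, and it counts how many entries all the resizes together have to move. `AmortizedResize` and `entriesMoved` are names chosen for this sketch.

```java
class AmortizedResize {
    // Total number of entries moved by all resizes during the given number of
    // insertions, for a table that doubles when size > capacity * loadFactor.
    static long entriesMoved(int insertions, int initialCapacity, double loadFactor) {
        long moved = 0;
        int capacity = initialCapacity;
        for (int size = 1; size <= insertions; size++) {
            if (size > capacity * loadFactor) {
                moved += size;   // a resize touches every entry currently stored
                capacity *= 2;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Resizes happen at sizes 13, 25, 49, 97, 193, 385 and 769.
        System.out.println(entriesMoved(1000, 16, 0.75)); // prints 1531
    }
}
```

Each individual resize is O(N), but because the table doubles, the total work stays within a small constant multiple of the number of insertions, which is why the average cost per insertion is O(1).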

So could a good load factor affect this performance? Like better and faster than O(N)?

Choice of load factor affects performance, but not complexity.

If we make normal assumptions about the hash function and key clustering, when the load factor is larger:

  • the average hash chain length is longer, but still O(1),
  • frequency of resizes reduces, but is still O(1/N),
  • the cost of a resize remains about the same, and the complexity is still O(N).
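The effect of the load factor on resize frequency can be checked with a rough model. The sketch assumes a doubling table that starts at 16 buckets and resizes when size exceeds capacity × load factor; `ResizeCount` is a hypothetical helper, not a JDK class.

```java
class ResizeCount {
    // Number of resizes triggered by the given number of insertions.
    static int resizes(int insertions, int initialCapacity, double loadFactor) {
        int capacity = initialCapacity;
        int count = 0;
        for (int size = 1; size <= insertions; size++) {
            if (size > capacity * loadFactor) {
                capacity *= 2;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println(resizes(n, 16, 0.5));  // prints 11
        System.out.println(resizes(n, 16, 0.75)); // prints 10
        System.out.println(resizes(n, 16, 1.0));  // prints 10
    }
}
```

A larger load factor saves at most a resize or two here. The count grows logarithmically with the number of insertions in every case, so the complexity class is unchanged.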

... so whenever we resize the table we don't necessarily need to rehash all the keys (correct me if I am wrong).

Actually, you would need to rehash all of the keys. When you double the hash table size, the hash chains need to be split. To do this, you need to test which of two chains the hash value for every key maps to. (Indeed, you would need to do the same if the hash table had an open organization too.)

However, in the current generation of HashMap implementations, the hashcode values are cached in the chained entry objects, so that the hashcode for a key doesn't ever need to be recomputed.
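A simplified sketch of that chain split, using a hash value stored in each entry. The `Node` class and `split` method are illustrative only and are not the JDK's own code; the test `(hash & oldCap) == 0` relies on the table length being a power of two.

```java
class ChainSplit {
    // Simplified chained entry: the hash is computed once and stored.
    static final class Node {
        final int hash;
        final String key;
        Node next;

        Node(int hash, String key, Node next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    // Splits one old bucket into a "low" chain (same index) and a "high" chain
    // (index + oldCap). Returns {lowHead, highHead}; relative order is preserved.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        for (Node e = head; e != null; ) {
            Node next = e.next;
            e.next = null;
            if ((e.hash & oldCap) == 0) {      // the newly significant bit is 0
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                           // the newly significant bit is 1
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        return new Node[] { loHead, hiHead };
    }

    static int length(Node head) {
        int n = 0;
        for (Node e = head; e != null; e = e.next) n++;
        return n;
    }

    public static void main(String[] args) {
        // Hashes 5, 21 and 37 all map to bucket 5 in a 16-slot table.
        Node chain = new Node(5, "a", new Node(21, "b", new Node(37, "c", null)));
        Node[] halves = split(chain, 16);
        System.out.println(length(halves[0])); // prints 2 (hashes 5 and 37 stay in bucket 5)
        System.out.println(length(halves[1])); // prints 1 (hash 21 moves to bucket 21)
    }
}
```

Every entry is still visited once, so the resize remains O(N); the saving is only that `hashCode()` is not called again on the keys.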


One comment mentioned the degenerate case where all keys hash to the same hashcode. That can happen either due to a poorly designed hash function, or a skewed distribution of keys.

This affects performance of lookup, insertion and other operations, but it does not affect either the cost or frequency of resizes.

Stephen C answered Oct 20 '22 12:10
