
Why would a higher load factor in HashMap increase lookup cost?

From the JavaDoc of HashMap:

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).

If we use a higher value, why does it increase the lookup cost?
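
For reference, the load factor can be chosen when the map is constructed; a minimal sketch (the capacities and factors here are just illustrative values):

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFactorDemo {
    public static void main(String[] args) {
        // Defaults: initial capacity 16, load factor 0.75
        Map<String, Integer> defaults = new HashMap<>();

        // Higher load factor: the table fills further before resizing,
        // saving space but raising the chance of collisions on lookup
        Map<String, Integer> dense = new HashMap<>(16, 0.9f);

        // Lower load factor: resizes sooner, fewer collisions, more memory
        Map<String, Integer> sparse = new HashMap<>(16, 0.5f);
    }
}
```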

asked Aug 25 '12 by Geek


2 Answers

A hash table's load factor is defined as

n/s, the ratio of the number of stored entries n and the size s of the table's array of buckets.

A hash table performs well when the number of collisions is low. With a higher load factor, the same number of entries is stored in fewer buckets, which increases the probability of collisions.
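
A rough way to see this is to simulate random bucket assignment (a stand-in for hashing, not HashMap's actual internals) and count how often an insertion lands in an already-occupied bucket at different load factors:

```java
import java.util.Random;

public class CollisionSketch {
    public static void main(String[] args) {
        int buckets = 1 << 16;                           // s
        Random rnd = new Random(42);
        for (double loadFactor : new double[]{0.5, 0.75, 1.0, 2.0}) {
            int entries = (int) (buckets * loadFactor);  // n = s * (n/s)
            boolean[] occupied = new boolean[buckets];
            int collisions = 0;
            for (int i = 0; i < entries; i++) {
                int b = rnd.nextInt(buckets);            // pretend hash(key)
                if (occupied[b]) collisions++;           // bucket already taken
                else occupied[b] = true;
            }
            System.out.printf("load factor %.2f -> %d of %d insertions collided%n",
                    loadFactor, collisions, entries);
        }
    }
}
```

The higher the load factor, the larger the fraction of entries that share a bucket, and every shared bucket means extra equals() checks during get().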

answered by Sergey Kalinichenko


Here we should first understand what capacity and load factor mean:

capacity: the number of buckets in the hash table at any given point in time.

load factor: a measure of how full the hash table is allowed to get before its capacity is automatically increased.

So the higher the load factor, the more occupied the hash table can get before its capacity is increased.
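
In concrete numbers, the map grows once its size exceeds capacity × load factor; a minimal sketch of that arithmetic, using the default values quoted from the JavaDoc above:

```java
public class ThresholdSketch {
    public static void main(String[] args) {
        // Resize happens when size exceeds capacity * loadFactor:
        // with capacity 16 and load factor 0.75 the table grows
        // when the 13th entry is added.
        int capacity = 16;
        float loadFactor = 0.75f;
        int threshold = (int) (capacity * loadFactor);  // 12
        System.out.println("Grows when size exceeds " + threshold);
    }
}
```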

  • Given the best possible hashCode() implementation, each value goes into its own bucket, and the lookup cost is minimal.
  • In the worst case, all values go into the same bucket, and the lookup cost is maximal (see the sketch after this list).
  • The average case also depends on the hashCode() implementation, but the load factor plays a part too: the more occupied the collection is, the greater the chance of collisions, so a higher load factor increases the lookup cost in any non-ideal scenario.
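
The worst case from the list above can be forced with a deliberately bad hashCode(); a sketch (the class name is made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Every instance hashes to the same bucket, so lookups degrade toward
// scanning all entries (Java 8+ softens this by turning very long
// bucket chains into balanced trees).
public class WorstCaseKey {
    private final int id;
    WorstCaseKey(int id) { this.id = id; }

    @Override public int hashCode() { return 1; }   // all keys collide
    @Override public boolean equals(Object o) {
        return o instanceof WorstCaseKey && ((WorstCaseKey) o).id == id;
    }

    public static void main(String[] args) {
        Map<WorstCaseKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) map.put(new WorstCaseKey(i), i);
        // This get() has to search the single overcrowded bucket
        System.out.println(map.get(new WorstCaseKey(9_999)));
    }
}
```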
answered by Vivek Gupta