Is there an optimal size for a hashtable related to the entry count?
So for entries = n
is there an optimal (or recommended) size s
for the hashtable which depends on n
? Lets say 2n
(double the entries count) or some other value?
Is it depending on the internal structure (hash function, bucket size, etc.)? Please provide some evidence when claiming something.
But a good general “rule of thumb” is: The hash table should be an array with length about 1.3 times the maximum number of keys that will actually be in the table, and. Size of hash table array should be a prime number.
Constructs a new, empty hashtable with a default initial capacity (11) and load factor (0.75).
They famously are only divisible by 1 and themselves. Thus, choosing to set your hash table length to a large prime number will greatly reduce the occurrence of collisions.
A bucket is simply a fast-access location (like an array index) that is the the result of the hash function. The idea with hashing is to turn a complex input value into a different value which can be used to rapidly extract or store data.
The ratio between the size of the table and the number of entries is called the load factor of a hash table.
The load factor crucially determines the expected runtime behaviour. For the usual bounds (i.e. expected time O(1) on all operations) to apply, it has to be smaller than 1.
In practice, the remark by Pete Wilson applies: one tries to keep the load factor close to 1 in order not to waste space; a prime number size for the table is often used to improve the collision characteristics of the hash function – but other strategies exist.
In java, with class HashTable, the default load factor (.75) offers a good tradeoff between time and space costs.
A higher load factor value decreases the space requirements and increases the odds of a collision. A collision increases the amount of time needed to perform a get() and put(...).
A lower load factor value increases disk/memory space requirements, causing lots of reserved space that is permanently unused. The increased number of bins decreases the odds of a collision.
So a load factor of (.75) means the HashTable bins are 75% full. If you have 75 elements to store, the number of bins in your HashTable should be 100.
Therefore, answering your question, given N as the number of items to store in your HashTable, the size of your HashTable should be about (1.33 * n). Other circumstances may make different load factors faster in some situations.
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Hashtable.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With