Possible Duplicate:
Why should hash functions use a prime number modulus?
Why is it necessary for a hash table's (the data structure) size to be a prime?
From what I understand, it assures a more even distribution but is there any other reason?
The only reason is to avoid clustering of values into a small number of buckets (yes, distribution). A more even distributed hashtable will perform more consistently.
from http://srinvis.blogspot.com/2006/07/hash-table-lengths-and-prime-numbers.html
If suppose your hashCode function results in the following hashCodes among others {x , 2x, 3x, 4x, 5x, 6x...}, then all these are going to be clustered in just m number of buckets, where m = table_length/GreatestCommonFactor(table_length, x). (It is trivial to verify/derive this). Now you can do one of the following to avoid clustering
Make sure that you don't generate too many hashCodes that are multiples of another hashCode like in {x, 2x, 3x, 4x, 5x, 6x...}.But this may be kind of difficult if your hashTable is supposed to have millions of entries.
Or simply make m equal to the table_length by making GreatestCommonFactor(table_length, x) equal to 1, i.e by making table_length coprime with x. And if x can be just about any number then make sure that table_length is a prime number.
Update: (from original answer author)
This answer is correct for a common implementation of a hash table, including the Java implementation of the original Hashtable
as well as the current implementation of .NET's Dictionary
.
Both the answer and the supposition that capacity should be prime are not accurate for Java's HashMap
though. The implementation of a HashMap
is very different and utilizes a table with a size of base 2 to store buckets and uses n-1 & hash
to calculate which bucket to use as opposed to the more traditional hash % n
formula.
Java's HashMap
will force the actual used capacity to be the next largest base 2 number above the requested capacity.
Compare Hashtable
:
int index = (hash & 0x7FFFFFFF) % tab.length
https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/util/Hashtable.java#L364
To HashMap
:
first = tab[(n - 1) & hash]
https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/util/HashMap.java#L569
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With