I am currently playing around with hashing and key generation trying to make my own hash key generator.
At the moment I have a list of ~90000 strings (each a single, distinct word). I was wondering what the best way to generate keys (numeric keys, not string keys) would be?
Currently I do a calculation based on the ASCII value of each word's last character.
The result is that about 50% of the words generate a key that clashes with another word's key.
I have used quadratic probing to then find space in the table for the rest of the words.
My question, as above, is: what is generally the best way to generate a key for 90000 different words? I know that the larger the data set, the more likely clashes become, but how would you suggest minimising them?
Edit: Also - I don't care about cryptography, it just needs to be fast.
Thanks.
You can "borrow" Java's implementation of String's hashCode*:
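unsigned int hashCode(const char* s) {
    /* unsigned arithmetic wraps on overflow; signed overflow is undefined in C */
    unsigned int h = 0;
    while (*s) {
        /* cast so characters above 127 are not sign-extended */
        h = 31*h + (unsigned char)(*s++);
    }
    return h;
}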
This function achieves a reasonable separation, and is among the most widely used hash functions out there.
* which, as it turns out, Java in turn "borrowed" from Kernighan & Ritchie's book on C programming.
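To turn the hash into a table index and see how many words clash, something like the following sketch might help. The names hash_code and count_collisions are made up for this illustration, and the table size is whatever you choose to pass in (a prime such as 131071, which is 2^17 - 1, would be one assumed choice for ~90000 words). It only counts words whose first-choice slot is already taken; it does not do any probing.

```c
#include <stdlib.h>

/* Same multiply-by-31 scheme as the answer; unsigned arithmetic wraps on overflow. */
static unsigned int hash_code(const char *s) {
    unsigned int h = 0;
    while (*s) {
        h = 31u * h + (unsigned char)(*s++);
    }
    return h;
}

/* Hypothetical helper: count the words whose home slot (hash % table_size)
 * was already claimed by an earlier word. Returns -1 if allocation fails. */
static long count_collisions(const char **words, size_t n, unsigned int table_size) {
    unsigned char *used = calloc(table_size, 1); /* one flag per slot, zeroed */
    long collisions = 0;
    if (used == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        unsigned int slot = hash_code(words[i]) % table_size;
        if (used[slot]) {
            collisions++;      /* slot taken: this word would need probing */
        } else {
            used[slot] = 1;    /* first word to land here */
        }
    }
    free(used);
    return collisions;
}
```

Running count_collisions over your word list with a few different table sizes should give a rough idea of how much work is left for the quadratic probing step.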