What is the best 32bit hash function for relatively short strings?
Strings are tag names that consist of English letters, numbers, spaces and some additional characters (#
, $
, .
, ...). For example: Unit testing
, C# 2.0
.
I am looking for 'best' as in 'minimal collisions', performance is not important for my goals.
The 32-bit long hash value is a hexadecimal number of 8 characters. MD4 is a Message Digest Algorithm developed by Ronald L. Rivest from RSA Data Security, Inc. Currently it's considered insecure, but it's very fast on 32-bit mashines and it's used for calculating EDonkey 2000 hashes in the EDonkey p2p network.
Probably the one most commonly used is SHA-256, which the National Institute of Standards and Technology (NIST) recommends using instead of MD5 or SHA-1. The SHA-256 algorithm returns hash value of 256-bits, or 64 hexadecimal digits.
A good hash function to use with integer key values is the mid-square method. The mid-square method squares the key value, and then takes out the middle r bits of the result, giving a value in the range 0 to 2r−1. This works well because most or all bits of the key value contribute to the result.
Common attacks like brute force attacks can take years or even decades to crack the hash digest, so SHA-2 is considered the most secure hash algorithm.
I'm not sure if it's the best choice, but here is a hash function for strings:
The Practice of Programming (HASH TABLES, pg. 57)
/* hash: compute hash value of string */ unsigned int hash(char *str) { unsigned int h; unsigned char *p; h = 0; for (p = (unsigned char*)str; *p != '\0'; p++) h = MULTIPLIER * h + *p; return h; // or, h % ARRAY_SIZE; }
Empirically, the values 31 and 37 have proven to be good choices for the multiplier in a hash function for ASCII strings.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With