Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast String Hashing Algorithm with low collision rates with 32 bit integer [closed]

I have lots of unrelated named things that I'd like to do quick searches against. An "aardvark" is always an "aardvark" everywhere, so hashing the string and reusing the integer would work well to speed up comparisons. The entire set of names is unknown (and changes over time). What is a fast string hashing algorithm that will generate small (32 or 16) bit values and have a low collision rate?

I'd like to see an optimized implementation specific to C/C++.

like image 680
Jason Citron Avatar asked Sep 22 '08 10:09

Jason Citron


People also ask

What is the fastest hashing algorithm?

SHA-1 is fastest hashing function with ~587.9 ms per 1M operations for short strings and 881.7 ms per 1M for longer strings. MD5 is 7.6% slower than SHA-1 for short strings and 1.3% for longer strings.

What are the 3 types of the hash collision algorithms?

Take into account the following hash algorithms – CRC-32, MD5, and SHA-1. These are common hash algorithms with varying levels of collision risk.

Which hashing algorithm is collision free?

In particular, cryptographic hash functions exhibit these three properties: They are “collision-free.” This means that no two input hashes should map to the same output hash. They can be hidden. It should be difficult to guess the input value for a hash function from its output.

Which string hashing is best?

If you just want to have a good hash function, and cannot wait, djb2 is one of the best string hash functions i know. it has excellent distribution and speed on many different sets of keys and table sizes. you are not likely to do better with one of the "well known" functions such as PJW, K&R[1], etc.


2 Answers

Murmur Hash is pretty nice.

like image 74
yrp Avatar answered Sep 27 '22 21:09

yrp


One of the FNV variants should meet your requirements. They're fast, and produce fairly evenly distributed outputs.

like image 31
Nick Johnson Avatar answered Sep 27 '22 20:09

Nick Johnson