Looking for some hash function to make string to int mapping with following restrictions.
restrictions: Same strings go to same number. Different strings go to different numbers. During one run of application I am getting strings from same length, only in the runtime I know the length.
Any suggestions how to create the hash function ?
A hash function does never guarantee that two different values (strings in your case) yield different hash codes. However, same values will always yield the same hash codes.
This is because information gets lost. If you have a string of a length of 32 characters, it will have 64 bytes (2 bytes per char). An int
hash code has four bytes. This is inevitable and is called a collision.
Note: Dictionary<Tkey,TValue>
uses a hash table internally. Therfore it implements a collision resolution strategy. See An Extensive Examination of Data Structures Using C# 2.0 on MSDN.
Here is the current implementation of dictionary.cs.
You aren't going to find a hash algorithm that guarantees that the same integer won't be returned for different strings. By definition, hash algorithms have collisions. There are far more possible strings in the world than there are possible 32-bit integers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With