Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

hash that maps strings to integers

Looking for some hash function to make string to int mapping with following restrictions.

restrictions: Same strings go to same number. Different strings go to different numbers. During one run of application I am getting strings from same length, only in the runtime I know the length.

Any suggestions how to create the hash function ?

like image 567
Night Walker Avatar asked Dec 21 '22 03:12

Night Walker


2 Answers

A hash function does never guarantee that two different values (strings in your case) yield different hash codes. However, same values will always yield the same hash codes.

This is because information gets lost. If you have a string of a length of 32 characters, it will have 64 bytes (2 bytes per char). An int hash code has four bytes. This is inevitable and is called a collision.

Note: Dictionary<Tkey,TValue> uses a hash table internally. Therfore it implements a collision resolution strategy. See An Extensive Examination of Data Structures Using C# 2.0 on MSDN.

Here is the current implementation of dictionary.cs.

like image 66
Olivier Jacot-Descombes Avatar answered Dec 24 '22 01:12

Olivier Jacot-Descombes


You aren't going to find a hash algorithm that guarantees that the same integer won't be returned for different strings. By definition, hash algorithms have collisions. There are far more possible strings in the world than there are possible 32-bit integers.

like image 33
Robert Levy Avatar answered Dec 24 '22 00:12

Robert Levy