String to Integer Hashing Function with Precision

Question

I want to hash a char array in to an int or a long. The resulting value has to adhere to a given precision value. The function I've been using is given below:

int GetHash(const char* zKey, int iPrecision /*= 6*/)
{
        /////FROM : http://courses.cs.vt.edu/~cs2604/spring02/Projects/4/elfhash.cpp

        unsigned long h = 0;
        long M = pow(10, iPrecision);

        while(*zKey)
        {
                h = (h << 4) + *zKey++;
                unsigned long g = h & 0xF0000000L;
                if (g) h ^= g >> 24;
                h &= ~g;
        }            

        return (int) (h % M);
}

The string to be hashed is similar to "SAEUI1210.00000010_1".

However, this produces duplicate values in some cases. Are there any good alternatives which wouldn't duplicate the same hash for different string values.

ASk · Accepted Answer

The very definition of a hash is that it produces duplicate values for some values, due to hash value range being smaller than the space of the hashed data.

In theory, a 32-bit hash has enough range to hash all ~6 character strings (A-Z,a-z,0-9 only), without causing a collision. In practice, hashes are not a perfect permutation of the input. Given a 32-bit hash, you can expect to get hash collisions after hashing ~16 bit of random inputs, due to the birthday paradox.

Given a static set of data values, it's always possible to construct a hash function designed specifically for them, which will never collide with itself (of course, size of its output will be at least log(|data set|). However, it requires you to know all the possible data values ahead of time. This is called perfect hashing.

That being said, here are a few alternatives which should get you started (they are designed to minimize collisions)

sharptooth · Answer

Every hash will have collisions. Period. That's called a Birthday Problem.

You may want to check cryptographic has functions like MD5 (relatively fast and you don't care that it's insecure) but it also will have collisions.

String to Integer Hashing Function with Precision

Tags:

c++

hash

Gayan

2 Answers

ASk

sharptooth

Recent Activity

Donate For Us

String to Integer Hashing Function with Precision

Tags:

c++

hash

Gayan

2 Answers

ASk

sharptooth

Related questions

Recent Activity

Donate For Us