Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

function computing a hash number, what exactly does it do and why?

Tags:

c

hash

I know this function is doing something with hash numbers, but didnt understand exactly the purpose of this function? why "res *31 + *key"? why 31?

unsigned int HashAlg(char* key)
{
    unsigned int res = 0;

    while (*key != 0)
    {
        res = res * 31 + *key;
        ++key;
    }

    return res;
}
like image 448
Yuval Avatar asked Mar 19 '13 11:03

Yuval


People also ask

What is the purpose of a hash function?

Hash functions are used for data integrity and often in combination with digital signatures. With a good hash function, even a 1-bit change in a message will produce a different hash (on average, half of the bits change). With digital signatures, a message is hashed and then the hash itself is signed.

How does hashing work and why is it useful?

Hashing is the process of transforming any given key or a string of characters into another value. This is usually represented by a shorter, fixed-length value or key that represents and makes it easier to find or employ the original string. The most popular use for hashing is the implementation of hash tables.


1 Answers

The implementation is a variation of a multiplicative string hash function by D.J. Bernstein:

unsigned djb_hash ( void *key, int len )
{
  unsigned char *p = key;
  unsigned h = 0;
  int i;

  for ( i = 0; i < len; i++ )
    h = 33 * h + p[i];

  return h;
}

The purpose of hash functions like these is to map a search key, like the string "item1", to an index which can then be used in a hash table, a cache, etc.; simplistically, the hash value gives us the place in the table where the corresponding record for "item1" should be stored. Hash tables, in turn, are used to implement associative arrays and dynamic sets. For more detail I recommend starting at the Wikipedia page.

You can see that in your implementation, the constant 33 has been switched for 31. There isn't much real mathematical work which can definitively prove the relationship between prime numbers and hashing functions. The basic concept of using prime numbers in hash functions revolve around the concept of transforming the current state of the hash function (applying some form of mathematical operation such as multiplication or addition to the hash value). The result is constrained to be a new hash value that should statistically have a higher entropic value or in other words a very low bit-bias for any of the bits in the new hash value. In simple terms, when you multiply a set of random numbers by a prime number, the resulting numbers (when analyzed at a bit level) should show no bias towards being one state or another, i.e. P(Bi = 1) ~= 0.5. There is no concrete proof that this is the case or that it only happens with prime numbers, it just seems to be an ongoing self-proclaimed intuition that we seem obliged to follow. These properties are judged a-posteriori, meaning we try to analyze hash function (or PRNG) properties with chosen constants and develop an intuition which constants "work well", i.e. producing specific distributions or demonstrating an avalanche effect, produce a uniform distribution for a specific set of inputs, etc.

like image 67
Michael Foukarakis Avatar answered Sep 21 '22 16:09

Michael Foukarakis