function computing a hash number, what exactly does it do and why?

1 Answer

The implementation is a variation of a multiplicative string hash function by D.J. Bernstein:

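/* Multiplicative string hash: folds each byte of the key into h. */
unsigned djb_hash ( const void *key, int len )
{
  const unsigned char *p = key;  /* read the key as raw bytes */
  unsigned h = 0;
  int i;

  /* Unsigned overflow wraps around, which is intended here. */
  for ( i = 0; i < len; i++ )
    h = 33 * h + p[i];

  return h;
}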

The purpose of hash functions like these is to map a search key, like the string "item1", to an index which can then be used in a hash table, a cache, etc.; simplistically, the hash value gives us the place in the table where the corresponding record for "item1" should be stored. Hash tables, in turn, are used to implement associative arrays and dynamic sets. For more detail I recommend starting at the Wikipedia page.
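
As a minimal sketch of that mapping, the hash value can be reduced modulo the number of buckets to get a slot in the table. The names TABLE_SIZE and bucket_for are hypothetical and chosen only for illustration; a real table would also need to handle collisions.

```c
#include <string.h>

#define TABLE_SIZE 64u  /* hypothetical bucket count, for illustration only */

/* Same multiplicative hash as shown in the answer. */
static unsigned djb_hash(const void *key, int len)
{
    const unsigned char *p = key;
    unsigned h = 0;
    int i;

    for (i = 0; i < len; i++)
        h = 33 * h + p[i];

    return h;
}

/* Hypothetical helper: reduce the hash of a C string to a bucket index
 * in the range [0, TABLE_SIZE). */
static unsigned bucket_for(const char *key)
{
    return djb_hash(key, (int)strlen(key)) % TABLE_SIZE;
}
```

Calling bucket_for("item1") always yields the same index below TABLE_SIZE, which is where the record for "item1" would be stored or looked up.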

You can see that in your implementation, the constant 33 has been replaced by 31. Note that 31 is prime, whereas 33 (3 * 11) is not. There is little rigorous mathematical work that establishes a relationship between prime numbers and the quality of hash functions. The basic idea behind using prime numbers in hash functions is that each step transforms the current state of the hash (by applying some mathematical operation, such as multiplication or addition, to the hash value), and the resulting new hash value should statistically have higher entropy, or in other words a very low bit bias for every bit. In simple terms, when you multiply a set of random numbers by a prime number, the resulting numbers (when analyzed at the bit level) should show no bias towards either state, i.e. P(Bi = 1) ~= 0.5.

There is no concrete proof that this is the case, or that it only happens with prime numbers; it is a long-standing intuition that people tend to follow. These properties are judged a posteriori: we analyze the properties of a hash function (or PRNG) with chosen constants and develop an intuition for which constants "work well", e.g. which ones produce a uniform distribution for a specific set of inputs or demonstrate an avalanche effect.
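
One way such an a posteriori check could look is sketched below: hash a chosen set of keys with a given multiplier and measure how often a particular output bit is set. The functions mul_hash and bit_set_fraction are hypothetical names introduced for this example, and a result near 0.5 only says something about the specific key set that was tested.

```c
#include <stddef.h>
#include <string.h>

/* Multiplicative hash with the multiplier as a parameter (e.g. 33 or 31). */
static unsigned mul_hash(const void *key, size_t len, unsigned mult)
{
    const unsigned char *p = key;
    unsigned h = 0;
    size_t i;

    for (i = 0; i < len; i++)
        h = mult * h + p[i];

    return h;
}

/* Fraction of keys whose hash has bit number `bit` set.
 * `bit` must be smaller than the width of unsigned.
 * A value close to 0.5 suggests low bias for this bit on this key set. */
static double bit_set_fraction(const char *const *keys, size_t nkeys,
                               unsigned mult, unsigned bit)
{
    size_t i, ones = 0;

    for (i = 0; i < nkeys; i++)
        ones += (mul_hash(keys[i], strlen(keys[i]), mult) >> bit) & 1u;

    return nkeys ? (double)ones / (double)nkeys : 0.0;
}
```

Running bit_set_fraction over the same keys for each bit position, once with 33 and once with 31, gives a rough empirical comparison of the two constants on that input.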

answered Sep 21 '22 16:09

Michael Foukarakis

Tags: c, hash

Yuval
