Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why 5 bit left shift in hashing function?

A popular answer for generating a hashing function in JS is given in Simple (non-secure) hash function for JavaScript? and Generate a Hash from string in Javascript

One example of the code is:

String.prototype.hashCode = function() {
    var hash = 0;
    if (this.length == 0) {
        return hash;
    }
    for (var i = 0; i < this.length; i++) {
        var char = this.charCodeAt(i);
        hash = ((hash<<5)-hash)+char;
        hash = hash & hash; // Convert to 32bit integer
    }
    return hash;
}

One line that doesn't make sense to me is hash = ((hash<<5)-hash)+char;

Could someone please explain WHY this is done? I gather that we are doing a 5 bit left shift on the hash. Is there any reason why it is 5 bits and not 4 or 6? Also why do we then minus the hash and add the char?

like image 910
TomDane Avatar asked Aug 22 '18 05:08

TomDane


People also ask

How does left shift operator work?

The left-shift operator causes the bits in shift-expression to be shifted to the left by the number of positions specified by additive-expression . The bit positions that have been vacated by the shift operation are zero-filled.

How is hash key calculated?

With modular hashing, the hash function is simply h(k) = k mod m for some m (usually, the number of buckets). The value k is an integer hash code generated from the key. If m is a power of two (i.e., m=2p), then h(k) is just the p lowest-order bits of k.

How does Left Shift and Right Shift Work?

Left Shift and Right Shift Operators in C/C++ Takes two numbers, left shifts the bits of the first operand, the second operand decides the number of places to shift. Or in other words left shifting an integer “x” with an integer “y” denoted as '(x<<y)' is equivalent to multiplying x with 2^y (2 raised to power y).


1 Answers

(hash << 5) is (hash * 32), so ((hash << 5) - hash) is (hash * 31). And the reason for multiplication by 31 is described in the answers to question Why does Java's hashCode() in String use 31 as a multiplier?

So, if this is changed to (hash * 31), the result is the same. Possibly (hash << 5) - hash is slightly faster, as shift / subtraction might be faster than multiplication. However, if that's really the case depends on many factors (whether JIT compilation is used, and optimizations in the JIT, and even the processor). So I assume the author of the code tested it, and found it to be faster for his case.

like image 193
Thomas Mueller Avatar answered Oct 08 '22 15:10

Thomas Mueller