Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why chose 31 to do the multiplication in the hashcode() implementation ? [duplicate]

Possible Duplicate:
Why does Java's hashCode() in String use 31 as a multiplier?
Why use a prime number in hashCode?

From Effective Java Item 9: Always override hashCode when you override equals consider the following relevant code snippet that overrides the hashcode() defined in Object class.

public final class PhoneNumber {
  private final short areaCode;
  private final short prefix;
  private final short lineNumber;
  ......//Rest ignored on purpose
  .....

private volatile int hashCode; // (See Item 71)
@Override public int hashCode() {
   int result = hashCode;
   if (result == 0) {
    result = 17;
    result = 31 * result + areaCode;
    result = 31 * result + prefix;
    result = 31 * result + lineNumber;
    hashCode = result;
   }
 return result;
 }
}

I understand why a non zero initial value "17" is chosen . I also understand the that multiplication is done in each step so that the order of fields play an important role in computing hashcode(). But I do not understand the reason for choosing 31 as the value for multiplication . Can some one explain to me why ? This is what Bloch has to say about 31 but I do not really get it . I specifically fail to understand the line in italics below.

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional.

like image 295
Geek Avatar asked Jul 18 '12 08:07

Geek


2 Answers

Shifting left just introduces a zero on the right and loses a bit on the left of the number's binary representation, so it's a clear information loss. Repeating this process gradually loses all information that was accumulated from earlier computation. That means that the more fields enter your hashcode calculation, the less effect on the final result the early fields have.

like image 63
Marko Topolnik Avatar answered Oct 16 '22 13:10

Marko Topolnik


he reason for using a prime is that it is more likely to produce a random pattern. If you use 9 for example you can get over lap with multiples of 3.

AFAIK 31 is used for Strings as there is less than 31 letters in the alphabet meaning all words of up to 6 letters have a unique hash code. If you use 61 (prime less than 64) for example, up to 5 letters would produce unique codes and if you use 13 (prime less than 16) you can get collisions with two letters words.

like image 31
Peter Lawrey Avatar answered Oct 16 '22 13:10

Peter Lawrey