I was wondering how to calculate the hash code for a given string by hand. I understand that in Java, you can do something like:
String me = "What you say what you say what?";
long whatever = me.hashCode();
That's all good and dandy, but I was wondering how to do it by hand. I know the given formula for calculating the hash code of a string is something like:
S0 X 31 ^ (n-1) + S1 X 31 ^ (n-2) + .... + S(n-2) X 31 + S(n-1)
Where S indicates the character in the string, and n is the length of the string. Using 16 bit unicode then, the first character from string me would be computed as:
87 X (31 ^ 34)
However, that creates an insanely large number. I can't imagine adding all the characters together like that. So, in order to calculate the lowest-order 32 bits result then, what would I do? Long whatever from above equals -957986661 and I'm not how to calculate that?
Take a look at the source code of java.lang.String
.
/**
* Returns a hash code for this string. The hash code for a
* <code>String</code> object is computed as
* <blockquote><pre>
* s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
* </pre></blockquote>
* using <code>int</code> arithmetic, where <code>s[i]</code> is the
* <i>i</i>th character of the string, <code>n</code> is the length of
* the string, and <code>^</code> indicates exponentiation.
* (The hash value of the empty string is zero.)
*
* @return a hash code value for this object.
*/
public int hashCode() {
int h = hash;
int len = count;
if (h == 0 && len > 0) {
int off = offset;
char val[] = value;
for (int i = 0; i < len; i++) {
h = 31*h + val[off++];
}
hash = h;
}
return h;
}
Most hash functions of this sort calculate the hash value modulo some large number (e.g. a large prime). This avoids overflows and keeps the range of values returned by the function within a specified range. But this also means an infinite range of input values will get a hash value from a finite set of possible values (i.e. [0,modulus)), hence the problem of hash collisions.
In this case, the code would look something like this:
public int hash(String x){
int hashcode=0;
int MOD=10007;
int shift=29;
for(int i=0;i<x.length();i++){
hashcode=((shift*hashcode)%MOD+x.charAt(i))%MOD;
}
return hashcode;
}
Exercise for the reader:
See the code for the hashCode
function for java.util.String. Can you see why it does not use a modulus explicitly?
The following statements will find the string hashCode
String str="Hi";
int a = str.hashCode();//returns 2337
Let's check how exactly its calculated
HashCode = s[0]*31(n-1) + s[1]*31(n-2) + .. s(n-2)
As we all know that the character at position 0 is H, Character at position 1 is i, and the string length is 2.
==> H*31(2-1) + i*31(2-2)
As we all know that, ASCII code of H is 72, and i is 105. It means,
==> 72 * 31 + 105 * 1 (Anything Power 0 is 1)
==> 2232 + 105 = 2337
Source: https://www.tutorialgateway.org/find-string-hashcode-in-java/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With