
Will the hashCode of a String be the same for the entire application?

Tags:

java

I am working on a hashing-based program. My question is: will the hashCode of a String remain the same for the entire application?

The reason I am asking is that KetamaMemcachedSessionLocator, used with Memcached servers, works this way: if Memcached is running on two servers, I want to locate a key on a particular server.

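String key = "MyString";
int keyId = key.hashCode();   // may be negative
int serverCount = 2;          // two Memcached servers
// Math.floorMod (Java 8+) keeps the index in [0, serverCount) even for a negative hash code.
// I assume that 0 will contact the first server and 1 the second server to retrieve the value.
int serverIndex = Math.floorMod(keyId, serverCount);
String value = MemcachedClient.get(key, serverIndex);  // pseudocode: fetch the key from the chosen server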

I implemented the above based on this page:

http://dev.mysql.com/doc/refman/5.0/en/ha-memcached-using-hashtypes.html

Please share your views, in case you find any issues with this approach.

Pawan asked Oct 28 '12 06:10



2 Answers

According to the hashCode() contract, the hash code will always be the same if string1.equals(string2).

The java.lang.String hash function

In an attempt to provide a fast implementation, early versions of the Java String class provided a hashCode() implementation that considered at most 16 characters picked from the string. For some common data this worked very poorly, delivering unacceptably clustered results and consequently slow hashtable performance.

From Java 1.2, the java.lang.String class implements its hashCode() using a product sum algorithm over the entire text of the string. An instance s of the java.lang.String class, for example, would have a hash code h(s) defined by

 h(s)=\sum_{i=0}^{n-1}s[i] \cdot 31^{n-1-i}

where terms are summed using Java 32-bit int addition, s[i] denotes the ith character of the string, and n is the length of s.
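The formula can be checked directly. The following is a minimal sketch (the class and method names are made up for illustration) that evaluates the sum with Horner's rule in 32-bit int arithmetic and compares the result with the built-in String.hashCode():

```java
public class StringHashFormula {
    // Evaluates h(s) = sum over i of s[i] * 31^(n-1-i) using Horner's rule.
    // int overflow wraps around silently, which matches the 32-bit int
    // addition described in the text.
    public static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String key = "MyString";
        // Compares the hand-computed value with the library value.
        System.out.println(manualHash(key) == key.hashCode()); // prints true
    }
}
```

Horner's rule avoids computing powers of 31 explicitly: each loop step multiplies the running total by 31 and adds the next character.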

As with any general hashing function, collisions are possible. For example, the strings "FB" and "Ea" have the same hash value. The hashCode() implementation of String uses the prime number 31 and the difference between 'a' and 'B' is just 31, so the calculation is 70 × 31 + 66 = 69 × 31 + 97.
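The collision is easy to reproduce. A small sketch, with a hypothetical helper named collide that is not part of any library:

```java
public class HashCollisionDemo {
    // True when two different strings share the same hash code.
    public static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("FB".hashCode());     // 70 * 31 + 66 = 2236
        System.out.println("Ea".hashCode());     // 69 * 31 + 97 = 2236
        System.out.println(collide("FB", "Ea")); // prints true
    }
}
```

Any code that uses a hash code as a lookup key therefore still has to compare the strings themselves with equals().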

Also check "Collections Framework Enhancements in Java SE 7": as you can see, the hashing used by the collections has already changed there, and it may change again. From that page:

The alternative hash function is only applied to keys of type String.

Amit Deshpande answered Oct 15 '22 22:10



Yes and no.

The hashCode() contract specifies that two equal strings will have the same hash code within the same JVM. That means that the code will not change as long as the string does not change.

On the other hand, the actual hashCode() implementation has changed from one JVM version to another and may differ from one JVM vendor to another. For example, Oracle Java 7u6 provides an alternative hashing function for String keys in maps whose capacity exceeds a configurable threshold. Currently it is only used within the Collections framework, but it could very well become a system-wide default with Java 8.

Basically, you can rely on hashCode() being consistent within the same application, but not between different application instances. If you intend to store or share hash codes, you should probably implement your own hash function.
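As an illustration of "your own function", here is a minimal sketch that hashes the UTF-8 bytes of the key with 32-bit FNV-1a and maps the result to a server index. The class name, method names and the choice of FNV-1a are assumptions made for this example; this is plain modulo selection, not the Ketama consistent-hashing scheme mentioned in the question, so adding or removing a server remaps most keys.

```java
import java.nio.charset.StandardCharsets;

public class StableServerPicker {
    private static final int FNV_OFFSET_BASIS = 0x811C9DC5;
    private static final int FNV_PRIME = 0x01000193;

    // 32-bit FNV-1a over the UTF-8 bytes of the key. The result depends only
    // on the bytes, not on any particular JVM's String.hashCode().
    public static int fnv1a32(String key) {
        int hash = FNV_OFFSET_BASIS;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= FNV_PRIME;
        }
        return hash;
    }

    // Math.floorMod (Java 8+) keeps the index in [0, serverCount)
    // even when the hash is negative; the % operator would not.
    public static int serverIndex(String key, int serverCount) {
        return Math.floorMod(fnv1a32(key), serverCount);
    }

    public static void main(String[] args) {
        System.out.println(serverIndex("MyString", 2));
    }
}
```

Because the hash is computed from an explicit byte encoding, every client that runs this code picks the same server for the same key.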

Another potential point of interest is that hashCode() as defined in Java returns an int, i.e. a value that is 32 bits long. That is by no means a unique identifier: collisions are quite frequent and the programmer is expected to handle them. If your storage system depends on unique keys, you might want to use a stronger hashing function, such as SHA-2, anyway.
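For reference, a SHA-2 digest can be computed with the standard java.security.MessageDigest class. The sketch below (class and method names are illustrative) returns the SHA-256 digest of a key as a lowercase hex string:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256Key {
    // Returns the SHA-256 digest of the UTF-8 bytes of the input as 64 hex characters.
    public static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("MyString"));
    }
}
```

A 256-bit digest makes accidental collisions negligible in practice, at the cost of a longer key and more CPU time per hash than hashCode().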

thkala answered Oct 15 '22 21:10
