Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What precisely is the algorithm used by java.lang.Object's hashCode

Tags:

java

openjdk

What is the algorithm used in the JVM to implement java.lang.Object's implicit hashCode() method?

[OpenJDK or Oracle JDK are preferred in the answers].

like image 219
djhaskin987 Avatar asked Jul 23 '14 22:07

djhaskin987


People also ask

What is algorithm used for hashCode in Java?

An efficient hashing algorithm generates distinct hashcodes for unequal objects. The hashCode method should be consistent in its implementation of the equals() function. hashCode() and hashCode(int value) methods, defined in the Integer class, compute the hash codes of integer objects or values in Java.

What is hashCode algorithm?

Simply put, hashCode() returns an integer value, generated by a hashing algorithm. Objects that are equal (according to their equals()) must return the same hash code. Different objects do not need to return different hash codes.

What is hashCode of object in Java?

A hash code is an integer value that is associated with each object in Java. Its main purpose is to facilitate hashing in hash tables, which are used by data structures like HashMap.

Which method is used to get the hashCode of the given object?

To get this hashcode value for an object, we can use the hashcode() method in Java. It is the means hashcode() method that returns the integer hashcode value of the given object. Since this method is defined in the Object class, hence it is inherited by user-defined classes also.


1 Answers

It's implementation dependant (and heavily so, the algorithm is entirely up to the implementation, as long as it's consistent.) However, as per the answer here, you can see the native source file where the hash is generated in OpenJDK 7 (look at the get_next_hash() function), which actually specifies a number of possible algorithms in this particular release:

// Possibilities:
// * MD5Digest of {obj,stwRandom}
// * CRC32 of {obj,stwRandom} or any linear-feedback shift register function.
// * A DES- or AES-style SBox[] mechanism
// * One of the Phi-based schemes, such as:
//   2654435761 = 2^32 * Phi (golden ratio)
//   HashCodeValue = ((uintptr_t(obj) >> 3) * 2654435761) ^ GVars.stwRandom ;
// * A variation of Marsaglia's shift-xor RNG scheme.
// * (obj ^ stwRandom) is appealing, but can result
//   in undesirable regularity in the hashCode values of adjacent objects
//   (objects allocated back-to-back, in particular).  This could potentially
//   result in hashtable collisions and reduced hashtable efficiency.
//   There are simple ways to "diffuse" the middle address bits over the
//   generated hashCode values
//

As already stated, the default algorithm is to simply go with a random number, however an interesting comment further down states:

// Marsaglia's xor-shift scheme with thread-specific state
// This is probably the best overall implementation -- we'll
// likely make this the default in future releases.

This is similar to the (obj ^ stwRandom) approach mentioned in the above list of possibilities, but it gets around the unwanted regularities associated with objects allocated quickly in succession by tying "thread-specific state" information into the hash also - so if two objects happened to be allocated at a very similar time due to them being allocated on two separate threads executing simultaneously, the discrete threading information fed into the hash should still ensure discrete enough hashes will be generated.

like image 183
Michael Berry Avatar answered Nov 14 '22 23:11

Michael Berry