Java String hashCode caching mechanism




Looking at Java's String class, we can see that the hash code is cached after its first evaluation.

public int hashCode() {
    int h = hash;  // read the cached value once
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;  // store only the finished result
    }
    return h;
}

Here hash is an instance variable. My question: why do we need the extra local variable h?

user3673623 asked Apr 27 '17 09:04


3 Answers

Simply because the value of hash would change inside the loop, and a version without the intermediate temporary variable is not thread-safe. Consider this method being invoked from several threads.

Say thread-1 starts computing the hash directly in the field, so hash is no longer 0. A moment later, thread-2 invokes hashCode() on the same object and sees that hash is not 0, although thread-1 has not yet finished its computation. As a result, thread-2 uses a wrong (not fully computed) hash value.
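A minimal sketch of the difference, using a hypothetical class CachedHash as a stand-in (this is not the real java.lang.String source). hashCodeUnsafe accumulates directly in the shared field, so another thread can observe a partial, non-zero value. hashCodeSafe follows the pattern from the question and writes only the finished result.

```java
/** Hypothetical stand-in for String, for illustration only. */
final class CachedHash {
    private final char[] value;
    private int hash; // shared between threads; 0 means "not computed yet"

    CachedHash(String s) {
        this.value = s.toCharArray();
    }

    /** Broken variant: accumulates in the shared field itself. */
    int hashCodeUnsafe() {
        if (hash == 0 && value.length > 0) {
            for (char c : value) {
                // Another thread may read `hash` at this point, see a
                // non-zero partial value, skip the loop, and return it.
                hash = 31 * hash + c;
            }
        }
        return hash;
    }

    /** Pattern from the question: accumulate locally, store once. */
    int hashCodeSafe() {
        int h = hash; // single read of the shared field
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h; // only the complete value is ever written
        }
        return h;
    }
}
```

In a single thread both methods return the same value; the difference only shows up under concurrent calls. Two threads racing through hashCodeSafe may both compute the hash, but each writes the same final value, so the duplicated work is harmless.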

Andremoniy answered Oct 07 '22 21:10


It's a simple and cheap thread-safety mechanism.

If one thread invokes hashCode() for the first time and a second thread invokes it while the first is still calculating the hash, the second thread would return an incorrect hash (an intermediate value of the first thread's calculation) if the code worked directly on the field.
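To make "an intermediate value" concrete, here is a small sketch that lists the values the running total takes while a string is hashed. The class name HashTrace and the method partialHashes are made up for this example. If the total lived in the shared field, every value except the last is one a second thread could have returned.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only. */
final class HashTrace {
    /** Returns the running total after each character is folded in. */
    static List<Integer> partialHashes(String s) {
        List<Integer> steps = new ArrayList<>();
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // same recurrence as in the question
            steps.add(h);
        }
        return steps;
    }
}
```

For example, HashTrace.partialHashes("abc") yields 97, then 31 * 97 + 98 = 3105, then 31 * 3105 + 99 = 96354. Only the last value equals "abc".hashCode(); 97 and 3105 are the wrong results a concurrent caller could pick up.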

Mario answered Oct 07 '22 21:10


To put it very simply: the local primitive h is, well, local, and thus thread-safe, as opposed to hash, which is shared.

Eugene answered Oct 07 '22 21:10
