Java HashMap put() implementation. Why not check references first?

Question

java.util.HashMap has an implementation of the put method, which has the following code inside it :

if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
    V oldValue = e.value;
    e.value = value;
    e.recordAccess(this);
    return oldValue;
}

In the above code why wasn't the reference check made first (since two objects having the same reference will have the same hash and equals()) ?

i.e. something like this :

if ((k = e.key) == key) {
    V oldValue = e.value;
    e.value = value;
    e.recordAccess(this);
    return oldValue;
} else if ( compare hash and equals) {
    // do something again with the value
}

Wouldn't this have saved a comparision?

jarnbjo · Accepted Answer

I don't know why, but a naïve microbenchmark suggests that on Oracle's VM (Intel, 32 or 64 bit), comparing two references takes about four times as long as comparing two ints (as in hash codes). I would have assumed that comparing two 32-bit integers and two address pointers should have had similar runtime cost on modern hardware, but I am probably just not considering something obvious here.

Assuming that different keys in most cases have different hash codes, comparing the hash before the key saves 75% runtime for each incorrect key and adds 25% runtime for the correct key. If this actually saves overall runtime depends of course on the exact content and layout of the hash map tables, but the Sun engineers obviously thought that this variant is better for most purposes.

Methods used for benchmarking:

public static int c1(int a, int b, int iter) {
    int r = 0;
    while((iter--)>0) {
        if(a == b) {
            r++;
        }
    }
    return r;
}

public static int c2(Object a, Object b, int iter) {
    int r = 0;
    while((iter--)>0) {
        if(a == b) {
            r++;
        }
    }
    return r;
}

Damian Leszczyński - Vash · Answer

The operations if_icmpne (compare of integer) and if_acmpne (compare of reference) use the same technique to obtain result [1,2,3,4].

Both have prepared values on stack and consumes form it equally. There should not be significant difference in operations required. Both will be done in single CPU cycle.

In order to state that map can store the object in the same bucket there must be validate two rules.

Their hashcode must be equal
The must return true when x.equals(y)

IMHO the code reflect those rules and i could be written as

if (e.hash == hash && key.equals(k))

In order to satisfy map requirement we must always validate hashes and equals.

So for performance reason the part key.equals(k) was optimized with (k = e.key) == key giving

((k = e.key) == key || key.equals(k))

This implementation mean that for maps we valuate more hashes and equals as then reference equality. So this is expected behaviour.

Java HashMap put() implementation. Why not check references first?

Tags:

java

standard-library

Ace McCloud

2 Answers

jarnbjo

Damian Leszczyński - Vash

Recent Activity

Donate For Us

Java HashMap put() implementation. Why not check references first?

Tags:

java

standard-library

Ace McCloud

2 Answers

jarnbjo

Damian Leszczyński - Vash

Related questions

Recent Activity

Donate For Us