Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java Hashtable#hashCode() implementation broken? [closed]

I'm wondering if the default implementation of Java's Hashtable#hashCode() is broken when the Hashtable contains only entries with identical keys and values per pair.

See for example the following application:

public class HashtableHash {
    public static void main(final String[] args) {
        final Hashtable<String, String> ht = new Hashtable<String, String>();

        final int h1 = ht.hashCode();
        System.out.println(h1); // output is 0

        ht.put("Test", "Test");

        final int h2 = ht.hashCode();
        System.out.println(h2); // output is 0 ?!?

        // Hashtable#hashCode() uses this algorithm to calculate hash code
        // of every element:
        //
        // h += e.key.hashCode() ^ e.value.hashCode()
        //
        // The result of XOR on identical hash codes is always 0
        // (because all bits are equal)

        ht.put("Test2", "Hello world");

        final int h3 = ht.hashCode();
        System.out.println(h3); // output is some hash code
    }
}

The hash code for an empty Hashtable is 0. After an entry with the key "Test" and value "Test" has been added to the Hastable the hash code still is 0.

The problem is that in Hashtable's hashCode() method the hash code of every entry is calculated and added to the hash code as follows

h += e.key.hashCode() ^ e.value.hashCode()

However XOR on identical hash codes (which is the case for identical Strings) is always 0. So entries with identical keys and values are not part of the Hashtable's hash code.

This implementation is imho broken because the Hashtable actually has changed. It shouldn't matter if key and value are identical.

like image 972
Sven Jacobs Avatar asked Mar 19 '12 19:03

Sven Jacobs


3 Answers

From the documentation on hashCode;

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

In other words, bad implementation - perhaps. Broken - not according to the spec.

like image 55
Joachim Isaksson Avatar answered Sep 20 '22 17:09

Joachim Isaksson


It’s not broken, it’s working as designed and advertised. The hash code of two Maps being equal does not require the two Maps being equal.

like image 33
Bombe Avatar answered Sep 18 '22 17:09

Bombe


The only requirement of hashCode is that if two objects are equal, then their hash codes must be equal. Thus

public int hashCode() {
    return 123;
}

is perfectly valid, although not optimal.

like image 35
Steve Kuo Avatar answered Sep 20 '22 17:09

Steve Kuo