Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

hashCode implementation strategies

Tags:

java

hashcode

Some major JVM classes (such as String or List implementations) implement equals by returning Σ 31^n * field_n.hashCode() for every field_n that is relevant for the equals method. Beside, this approach is recommended by Joshua Bloch in Effective Java (item 9).

However, other classes such as Map.Entry implementations follow different rules. For example, the Map.Entry documentation states that the hash code of a Map.Entry should be

 (e.getKey()==null   ? 0 : e.getKey().hashCode()) ^
 (e.getValue()==null ? 0 : e.getValue().hashCode())

This can sometimes be impractical to use in hash tables, since:

  • the hash code of all entries which have the same key and value is 0,
  • two entries e1 and e2 so that e1.key = e2.value and e1.value = e2.key have the same hash code.

Why did Java choose this implementation specification for Map.Entry hashCode instead of, for example, 31 * (e.getKey()==null ? 0 : e.getKey().hashCode()) + (e.getValue()==null ? 0 : e.getValue().hashCode())?

Edit 1:

To help figure out the problem, here is an example of useful code where the result has very poor performance due to hash collisions if many entries have the same key and value.

This method computes the frequencies of the entries of different maps (using Guava's Multiset).

public static <K, V> Multiset<Map.Entry<K, V>> computeEntryCounts(
        Iterable<Map<K, V>> maps) {
    ImmutableMultiset.Builder<Map.Entry<K, V>> result = ImmutableMultiset.builder();
    for (Map<K, V> map : maps) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            result.add(entry);
        }
    }
    return result.build();
}
like image 490
jpountz Avatar asked Oct 09 '22 05:10

jpountz


1 Answers

I doubt there's a good reason — I think it's just an oversight — but it's not a serious problem. It would only come up if you had a HashSet<Map.Entry<T,T>> or a HashMap<Map.Entry<T,T>,V>, which is not commonly done. (Edited to add: Or, as Joachim Sauer points out below, a HashSet<Map<T,T>> or a HashMap<Map<T,T>,V> — also not commonly done.)

Note that HashMap<K,V> does not use Map.Entry<K,V>.hashCode(), because it looks up entries by their keys only, so it just uses K.hashCode().

like image 164
ruakh Avatar answered Oct 12 '22 09:10

ruakh