Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java Map::hashCode() collision - why?

The following code results in the same hash code being generated for the two maps, any ideas?


import java.util.HashMap;
import java.util.Map;

public class Foo
{
    @SuppressWarnings("unchecked")
    public static void main (String[] args)
    {
        Map map;

        map = new HashMap();

        map.put("campaignId", 4770L);
        map.put("location", "MINI_PROFILE");
        map.put("active", "true");
        map.put("lazy", true);

        System.out.println(map.hashCode());

        map = new HashMap();

        map.put("campaignId", 4936L);
        map.put("location", "MINI_PROFILE");
        map.put("active", "true");
        map.put("lazy", false);

        System.out.println(map.hashCode());


    }
}

The result is:

-1376467648
-1376467648

Simply changing the key names is enough to make the code generate two different hash codes.

like image 425
Hisham Avatar asked Dec 02 '22 04:12

Hisham


2 Answers

Simply coincidence, I suspect... there are bound to be collisions, and in this case it looks like the relevant different bits in the first value are being lost, effectively.

However, it shouldn't make any difference - anything using hash codes must cope with collisions.

EDIT: It's just the way the hashes happen to be calculated. This code shows what's going on:

import java.util.*;

public class Test
{
    @SuppressWarnings("unchecked")
    public static void main (String[] args)
    {
        AbstractMap.SimpleEntry[] entries = {
            new AbstractMap.SimpleEntry("campaignId", 4770L),
            new AbstractMap.SimpleEntry("campaignId", 4936L),
            new AbstractMap.SimpleEntry("lazy", true),
            new AbstractMap.SimpleEntry("lazy", false)
        };
        for (AbstractMap.SimpleEntry entry : entries) {
            System.out.println(entry + ": " + entry.hashCode());
        }
    }
}

Results:

campaignId=4770: -1318251287
campaignId=4936: -1318251261
lazy=true: 3315643
lazy=false: 3315617

So in one pair the first map has a hash 26 less than the second map, and in another pair the first map has a hash 26 more than the second map.

AbstractMap just sums hash values (one way of making sure that ordering is irrelevant) so the two end up with the same hash code.

It's really down to Boolean.hashCode() which looks like this:

return value ? 1231 : 1237;

... and Long.hashCode() which looks like this:

return (int)(value ^ (value >>> 32));

Given the values they happened to pick in Boolean.hashCode(), if your long values are only 26 apart (or 26 * 2^32 apart) then you'll run into the same thing.

like image 140
Jon Skeet Avatar answered Dec 19 '22 00:12

Jon Skeet


I think this is just a coincidence. From the Javadoc for AbstractMap#hashCode():

The hash code of a map is defined to be the sum of the hash codes of each entry in the map's entrySet() view.

And for Entry#hashCode():

Returns the hash code value for this map entry. The hash code of a map entry e is defined to be:

 (e.getKey()==null   ? 0 : e.getKey().hashCode()) ^
 (e.getValue()==null ? 0 : e.getValue().hashCode())

So hash codes for maps are based on both the keys AND the values contained in the map. You're just experiencing a odd situation where two maps have the same hash code, with no apparent reason.

like image 32
Vivien Barousse Avatar answered Dec 19 '22 02:12

Vivien Barousse