
HashMap serialization and deserialization changes

We are working with an in-memory data grid (IMDG) and we have a migration tool. To verify that all the objects are migrated successfully, we calculate a checksum of each object from its serialized form.

We are seeing a problem with HashMap: after we serialize an object that contains one and then deserialize it, the checksum changes. Here is a simple test case:

@Test
public void testMapSerialization() throws IOException, ClassNotFoundException {
    TestClass tc1 = new TestClass();
    tc1.init();
    String checksum1 = SpaceObjectUtils.calculateChecksum(tc1);

    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutput out = null;
    byte[] objBytes = null;
    out = new ObjectOutputStream(bos);
    out.writeObject(tc1);
    objBytes = bos.toByteArray();
    out.close();
    ByteArrayInputStream bis = new ByteArrayInputStream(objBytes);
    ObjectInputStream in = new ObjectInputStream(bis);
    TestClass tc2 = (TestClass) in.readObject();
    String checksum2 = SpaceObjectUtils.calculateChecksum(tc2);

    assertEquals(checksum1, checksum2);
}

The TestClass looks like this:

class TestClass implements Serializable {
    private static final long serialVersionUID = 5528034467300853270L;

    private Map<String, Object> map;

    public TestClass() {
    }

    public Map<String, Object> getMap() {
        return map;
    }

    public void setMap(Map<String, Object> map) {
        this.map = map;
    }

    public void init() {
        map = new HashMap<String, Object>();
        map.put("name", Integer.valueOf(4));
        map.put("type", Integer.valueOf(4));
        map.put("emails", new BigDecimal("43.3"));
        map.put("theme", "sdfsd");
        map.put("notes", Integer.valueOf(4));
        map.put("addresses", Integer.valueOf(4));
        map.put("additionalInformation", new BigDecimal("43.3"));
        map.put("accessKey", "sdfsd");
        map.put("accountId", Integer.valueOf(4));
        map.put("password", Integer.valueOf(4));
        map.put("domain", new BigDecimal("43.3"));
    }
}

And this is the method to calculate the checksum:

public static String calculateChecksum(Serializable obj) {
    if (obj == null) {
        throw new IllegalArgumentException("The object cannot be null");
    }
    MessageDigest digest = null;
    try {
        digest = MessageDigest.getInstance("MD5");
    } catch (java.security.NoSuchAlgorithmException nsae) {
        throw new IllegalStateException("Algorithm MD5 is not present", nsae);
    }
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutput out = null;
    byte[] objBytes = null;
    try {
        out = new ObjectOutputStream(bos);
        out.writeObject(obj);
        objBytes = bos.toByteArray();
        out.close();
    } catch (IOException e) {
        // Keep the original exception as the cause so the failure can be diagnosed.
        throw new IllegalStateException(
                "There was a problem trying to get the byte stream of this object: " + obj, e);
    }
    digest.update(objBytes);
    byte[] hash = digest.digest();
    StringBuilder hexString = new StringBuilder();
    for (int i = 0; i < hash.length; i++) {
        String hex = Integer.toHexString(0xFF & hash[i]);
        if (hex.length() == 1) {
            hexString.append('0');
        }
        hexString.append(hex);
    }
    return hexString.toString();
}

If you print the maps of tc1 and tc2, you can see that the elements are not in the same place:

{accessKey=sdfsd, accountId=4, theme=sdfsd, name=4, domain=43.3, additionalInformation=43.3, emails=43.3, addresses=4, notes=4, type=4, password=4}
{accessKey=sdfsd, accountId=4, name=4, theme=sdfsd, domain=43.3, emails=43.3, additionalInformation=43.3, type=4, notes=4, addresses=4, password=4}

I would like to be able to serialize the HashMap and get the same checksum when I deserialize it. Do you know if there is a solution or if I'm doing something wrong?

Thanks!

Diego

dgaviola asked May 13 '11 14:05


2 Answers

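You are doing nothing wrong; it just can't be done with a HashMap. A HashMap does not guarantee iteration order, so two equal maps can be written out with their entries in a different order. Use a TreeMap instead, which always iterates in sorted key order.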

Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

Source: HashMap Javadoc
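To illustrate the TreeMap suggestion, here is a minimal sketch that serializes a TreeMap, deserializes it, serializes it again, and compares the two byte arrays. The class and method names (TreeMapRoundTrip, roundTripIsStable) are made up for this example and are not part of the question's code. It assumes the values themselves serialize deterministically, as Integer, String and BigDecimal do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the names here are assumptions for illustration only.
class TreeMapRoundTrip {

    // Serialize an object to bytes with standard Java serialization.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
        return bos.toByteArray();
    }

    // Rebuild an object from its serialized bytes.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }

    // Returns true if the bytes are identical before and after a round trip.
    static boolean roundTripIsStable() {
        // TreeMap iterates (and is therefore written) in sorted key order.
        Map<String, Object> map = new TreeMap<>();
        map.put("name", Integer.valueOf(4));
        map.put("emails", new BigDecimal("43.3"));
        map.put("theme", "sdfsd");
        map.put("accessKey", "sdfsd");
        map.put("accountId", Integer.valueOf(4));

        byte[] before = serialize(map);
        byte[] after = serialize(deserialize(before));
        return Arrays.equals(before, after);
    }

    public static void main(String[] args) {
        System.out.println("stable after round trip: " + roundTripIsStable());
    }
}
```

In the question's TestClass, the equivalent change would be to create the map with new TreeMap<String, Object>() in init(). Note that a TreeMap requires mutually comparable keys and, with natural ordering, rejects null keys.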

Sean Patrick Floyd answered Sep 28 '22 00:09


Your checksum cannot depend on the order of entries, because HashMap is not ordered. An alternative to TreeMap is LinkedHashMap (which retains insertion order), but the real solution is to use a hash code that doesn't depend on the order of the entries.
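One possible way to get an order-independent checksum is sketched below. This is only an illustration, not the method used by any particular IMDG: it digests each entry on its own, sorts the per-entry digests, and then digests the sorted list, so the iteration order of the map no longer matters. The class name OrderIndependentChecksum and its helpers are hypothetical. It also assumes that each key and value serializes deterministically by itself; a value that is itself a HashMap would need the same treatment recursively.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names are assumptions made for this sketch.
class OrderIndependentChecksum {

    // Serialize a single key or value with standard Java serialization.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
        return bos.toByteArray();
    }

    static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Algorithm MD5 is not present", e);
        }
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Checksum that is the same for equal maps regardless of iteration order.
    static String checksum(Map<?, ?> map) {
        List<String> entryDigests = new ArrayList<>();
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            // Digest each entry separately so its position in the map is irrelevant.
            MessageDigest d = md5();
            d.update(serialize(entry.getKey()));
            d.update(serialize(entry.getValue()));
            entryDigests.add(toHex(d.digest()));
        }
        // Sorting puts the per-entry digests into a canonical order.
        Collections.sort(entryDigests);

        MessageDigest total = md5();
        for (String digest : entryDigests) {
            total.update(digest.getBytes(StandardCharsets.US_ASCII));
        }
        return toHex(total.digest());
    }
}
```

With something like this, the test in the question could compare checksum(tc1.getMap()) with checksum(tc2.getMap()) instead of hashing the serialized form of the whole object. Other fields of the object would still have to be covered separately.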

Peter Lawrey answered Sep 28 '22 01:09