
Serializing and deserializing a map with key as string

I intend to serialize and deserialize a HashMap whose key is a String.

From Josh Bloch's Effective Java (p. 222), I understand the following:

For example, consider the case of a hash table. The physical representation is a sequence of hash buckets containing key-value entries. Which bucket an entry is placed in is a function of the hash code of the key, which is not, in general guaranteed to be the same from JVM implementation to JVM implementation. In fact, it isn't even guaranteed to be the same from run to run on the same JVM implementation. Therefore accepting the default serialized form for a hash table would constitute a serious bug. Serializing and deserializing the hash table could yield an object whose invariants were seriously corrupt.

My questions are:

1) In general, would overriding equals() and hashCode() in the map's key class resolve this issue, so that the map can be correctly restored?

2) If my key is a String, and the String class already overrides hashCode(), would I still have the problem described above? (I am seeing a bug that makes me think this is still a problem, even though the key is a String with an overridden hashCode().)

3) Previously, I got around this issue by serializing an array of (key, value) entries and reconstructing the map when deserializing. Is there a better approach?

4) If the answers to questions 1 and 2 are that it still can't be guaranteed, could someone explain why? If the hash codes are the same, would the entries go to the same buckets across JVMs?

Thanks, Grace

Grace K asked Apr 30 '10 21:04


3 Answers

The serialized form of java.util.HashMap does not include the buckets themselves, and the hash codes are not part of the persisted state. From the Javadocs:

Serial Data: The capacity of the HashMap (the length of the bucket array) is emitted (int), followed by the size of the HashMap (the number of key-value mappings), followed by the key (Object) and value (Object) for each key-value mapping represented by the HashMap. The key-value mappings are emitted in the order that they are returned by entrySet().iterator().

from http://java.sun.com/j2se/1.5.0/docs/api/serialized-form.html#java.util.HashMap

The persisted state basically comprises the keys and values and some housekeeping. When deserialized, the hashmap is completely rebuilt; the keys are rehashed and placed in appropriate buckets.

So, adding String keys should work just fine. I would guess your bug lies elsewhere.
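The rebuild-on-read behaviour described above can be sketched with a small class of your own. This is only an illustration of the pattern, not the actual HashMap source: the class name LogicalFormMap and its roundTrip helper are made up for this example. It writes just the size and the key/value pairs, and re-inserts every entry with put() when reading, so each key is rehashed in the JVM that deserializes it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class (not part of the JDK): serializes only the logical
// content of a map, never its bucket layout.
class LogicalFormMap implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: default serialization skips this field entirely
    private transient Map<String, Integer> entries = new HashMap<>();

    void put(String key, Integer value) { entries.put(key, value); }

    Integer get(String key) { return entries.get(key); }

    // Write the number of entries, then each key and value in turn.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(entries.size());
        for (Map.Entry<String, Integer> e : entries.entrySet()) {
            out.writeObject(e.getKey());
            out.writeObject(e.getValue());
        }
    }

    // Rebuild the map from scratch; put() rehashes every key locally.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int size = in.readInt();
        entries = new HashMap<>();
        for (int i = 0; i < size; i++) {
            String key = (String) in.readObject();
            Integer value = (Integer) in.readObject();
            entries.put(key, value);
        }
    }

    // Helper for the example: serialize to memory and read back a copy.
    static LogicalFormMap roundTrip(LogicalFormMap original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LogicalFormMap) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because nothing bucket-related is ever written, a lookup on the copy returned by roundTrip() depends only on equals() and hashCode() behaving consistently within the reading JVM.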

EDIT: Here's a JUnit 4 test case that serializes and deserializes a map, and mimics VMs changing hash codes. The test passes, despite the hash codes being different after deserialization.

import org.junit.Assert;
import org.junit.Test;

import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class HashMapTest
{
    @Test
    @SuppressWarnings("unchecked") // cast of the deserialized object cannot be checked
    public void testHashMapSerialization() throws IOException, ClassNotFoundException
    {
        Map<Key, Integer> map = new HashMap<Key, Integer>();
        map.put(new Key("abc"), 1);
        map.put(new Key("def"), 2);

        // serialize the map into an in-memory buffer
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ObjectOutputStream objOut = new ObjectOutputStream(out);
        objOut.writeObject(map);
        objOut.close();

        Key.xor = 0x7555AAAA; // make the hashcodes different

        // deserialize; the keys are rehashed with the new hashcodes
        ObjectInputStream objIn = new ObjectInputStream(new ByteArrayInputStream(out.toByteArray()));
        Map<Key, Integer> actual = (Map<Key, Integer>) objIn.readObject();
        objIn.close();

        // now try to get a value
        Assert.assertEquals(Integer.valueOf(2), actual.get(new Key("def")));
    }

    static class Key implements Serializable
    {
        private String keyString;
        static int xor = 0; // static, so not part of the serialized state

        Key(String keyString)
        {
            this.keyString = keyString;
        }

        @Override
        public int hashCode()
        {
            return keyString.hashCode() ^ xor;
        }

        @Override
        public boolean equals(Object obj)
        {
            // reject null and foreign types instead of throwing ClassCastException
            if (!(obj instanceof Key))
            {
                return false;
            }
            Key otherKey = (Key) obj;
            return keyString.equals(otherKey.keyString);
        }
    }

}
mdma answered Oct 18 '22 09:10


I'm 99% sure that the JDK's implementations of HashMap and HashSet handle this issue. They have custom serialization and deserialization handlers. I don't have Bloch's book in front of me right now, but I believe he is explaining the challenge, not saying that you can't reliably serialize a java.util.HashMap in practice.
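One way to check this claim on your own JDK is to look for the private writeObject/readObject pair via reflection. The sketch below is only a quick probe: the class name SerializationHookCheck and the method declaresCustomHooks are invented for this example, and a result of true tells you the hooks exist on the JDK you ran it on, not how they behave.

```java
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.HashSet;

// Hypothetical helper for this example, not a JDK class.
class SerializationHookCheck {

    // True if the class itself declares private writeObject(ObjectOutputStream)
    // and readObject(ObjectInputStream), the hooks used by Java serialization.
    static boolean declaresCustomHooks(Class<?> type) {
        try {
            Method write = type.getDeclaredMethod("writeObject", ObjectOutputStream.class);
            Method read = type.getDeclaredMethod("readObject", ObjectInputStream.class);
            return Modifier.isPrivate(write.getModifiers())
                && Modifier.isPrivate(read.getModifiers());
        } catch (NoSuchMethodException e) {
            return false; // at least one hook is missing
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap: " + declaresCustomHooks(HashMap.class));
        System.out.println("HashSet: " + declaresCustomHooks(HashSet.class));
    }
}
```

Only the lookup is done reflectively; nothing is invoked, so no setAccessible call is needed.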

Yishai answered Oct 18 '22 09:10


To serialize a hashmap:

I have tried this and used it in my app, and it works fine. Wrap this code in a function to suit your needs.

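import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// The class name is illustrative; the original snippet was a bare main method.
public class HashMapFileDemo
{
    @SuppressWarnings("unchecked") // cast of the deserialized object cannot be checked
    public static void main(String[] args)
    {
        Map<String,String> hashmap = new HashMap<String,String>();
        hashmap.put("key1","value1");
        hashmap.put("key2","value2");
        hashmap.put("key3","value3");
        hashmap.put("key4","value4");

        String path = "c://list.ser";

        // Write the map; try-with-resources closes the stream even on failure.
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(path))) {
            oos.writeObject(hashmap);
        } catch (IOException e) {
            e.printStackTrace();
            return; // nothing to read back if the write failed
        }

        // Read the map back from the same file.
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(path))) {
            Map<String,String> anotherList = (Map<String,String>) ois.readObject();
            System.out.println(anotherList);
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }
}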
Kumar Gaurav answered Oct 18 '22 07:10