
Serializable, cloneable and memory use in Java

I am using a nested class that extends HashMap. The keys are Strings and the values are double[] arrays, each holding about 200 doubles. By my estimate, the keys, the pointers and the doubles should take around 700 MB. However, memory analysis shows that I am using a lot more than that (a little over 2 GB).

Using TIJmp (a profiling tool), I saw a char[] entry that was using almost half of the total memory. TIJmp said that the char[] came from Serializable and Cloneable. Its contents ranged from a list of fonts and default paths to messages and single characters.

What is the exact behavior of Serializable in the JVM? Does it keep a "persistent" copy at all times, thus doubling my memory footprint? How can I write binary copies of an object at runtime without turning the JVM into a memory hog?

PS: The method where the memory consumption increases the most is the one below. The file has around 229,000 lines and 202 fields per line.

public void readThetas(String filename) throws Exception
{
    long t1 = System.currentTimeMillis();
    documents = new HashMapX<String,double[]>(); // Document names to theta arrays.
    Scanner s = new Scanner(new File(filename));
    int docIndex = 0;
    if (s.hasNextLine())
        System.out.println(s.nextLine()); // Consume useless first line :)
    while(s.hasNextLine())
    {
        String[] fields = s.nextLine().split("\\s+");
        String docName = fields[1];
        numTopics = fields.length/2-1;
        double[] thetas = new double[numTopics];
        // Fields from index 2 onward are (topic index, value) pairs.
        for (int i = 2; i + 1 < fields.length; i += 2)
            thetas[Integer.parseInt(fields[i].trim())] = Double.parseDouble(fields[i+1].trim());
        documents.put(docName,thetas);
        docIndex++;
        if (docIndex%10000==0)
            System.out.print("*"); //progress bar ;)
    }
    s.close();
    long t2 = System.currentTimeMillis();
    System.out.println("\nRead file in "+ (t2-t1) +" ms");
}

Oh, and HashMapX is a static nested class declared like this:

public static class HashMapX< K, V> extends HashMap<K,V> {
    public V get(Object key, V altVal) {
        if (this.containsKey(key))
            return this.get(key);
        else
            return altVal;
    }
}
fiacobelli asked Apr 19 '11 17:04


1 Answer

This may not address all of your questions, but here is one way in which serialization can significantly increase memory usage: http://java.sun.com/javase/technologies/core/basic/serializationFAQ.jsp#OutOfMemoryError.

In short, if you keep an ObjectOutputStream open, none of the objects that have been written to it can be garbage-collected unless you explicitly call its reset() method.
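
To illustrate, here is a minimal sketch (the class name ResetSketch and the method writeTwice are made up for this example, not taken from your code). It writes the same double[] twice to an ObjectOutputStream, with and without reset() in between. Without reset(), the stream still holds a reference to the array and writes only a short back-reference the second time. After reset(), it has forgotten the array and serializes it in full again, so the output is larger.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class ResetSketch {

    // Hypothetical helper: writes the same array twice to an in-memory
    // stream and returns the total number of bytes produced.
    static int writeTwice(boolean resetBetweenWrites) {
        double[] thetas = new double[200]; // same size as in the question
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(thetas);
            if (resetBetweenWrites) {
                // Clears the stream's table of already-written objects,
                // which also releases its references to them.
                out.reset();
            }
            out.writeObject(thetas);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("without reset(): " + writeTwice(false) + " bytes");
        System.out.println("with reset():    " + writeTwice(true) + " bytes");
    }
}
```

In a long-running writer, calling reset() periodically (say, every few thousand objects) lets already-written objects be collected, at the cost of losing back-references to them in the stream.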

Dave Costa answered Oct 16 '22 21:10