
Java - GC a large string

I have a method that reads and parses an extremely long XML file. The file is read into a string, which is then parsed by a different class. However, this causes the JVM to use a large amount of memory (~500 MB). Normally the program runs at around 30 MB, but when parse() is called, usage increases to 500 MB. When parse() has finished, the memory usage doesn't go back down to 30 MB; it stays at 500 MB.

I've tried setting s = null and calling System.gc() but the memory usage still stays at 500 MB.

public void parse(){
        try {
            System.out.println("parsing data...");
            String path = dir + "/data.xml";
            InputStream i = new FileInputStream(path);
            BufferedReader reader = new BufferedReader(new InputStreamReader(i));
            String line;
            String s = "";
            while ((line = reader.readLine()) != null){
                s += line + "\n";
            }

            ... parse ...

        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
}

Any ideas?

Thanks.

Jason Yang asked May 24 '26 12:05


1 Answer

Solution for your memory leak

You should close the BufferedReader when you are done with it; close() closes the stream and releases any system resources associated with it. You can close both the InputStream and the BufferedReader, but closing the BufferedReader also closes the stream it wraps.

Generally it's better to do this in a finally block, so that the reader is closed even if an exception is thrown.

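finally
{
   // Assumes reader is declared before the try block; closing it also closes i.
   try {
      if (reader != null) reader.close();
   } catch (IOException e) {
      e.printStackTrace();
   }
}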

Better approach: the try-with-resources statement

try (BufferedReader br = new BufferedReader(new FileReader(path))) 
{
        return br.readLine();
}
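To make the closing behaviour visible, here is a minimal sketch. The class TryWithResourcesSketch, the helper firstLine, and the TrackingReader wrapper are hypothetical names made up for illustration; they are not part of the question's code. The sketch reads from an in-memory string instead of data.xml and only shows that the reader is closed automatically when the try block exits.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class TryWithResourcesSketch {

    // Hypothetical reader that records whether close() was called on it.
    static class TrackingReader extends StringReader {
        boolean closed = false;

        TrackingReader(String content) {
            super(content);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    // Returns the first line of the source. The BufferedReader, and with it
    // the wrapped source, is closed when the try block exits, whether
    // normally or through an exception.
    static String firstLine(TrackingReader source) {
        try (BufferedReader br = new BufferedReader(source)) {
            return br.readLine();
        } catch (IOException e) {
            // Assumption for this sketch: rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackingReader source = new TrackingReader("<root>\n</root>\n");
        System.out.println(firstLine(source));
        System.out.println("closed: " + source.closed);
    }
}
```

No explicit finally block is needed here; the compiler generates the close() call for every resource declared in the parentheses after try.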

Bonus Note

Use a StringBuilder instead of concatenating strings

String does not allow appending in place. Each concatenation on a String creates a new object and returns it. This is because String is immutable: it cannot change its internal state.

On the other hand, StringBuilder is mutable. When you call append(), it writes into its internal buffer (growing it when needed) rather than creating a new String object.

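Thus it is more memory efficient to use a StringBuilder when you want to append many strings. In the loop from the question, s += line + "\n" copies the entire accumulated string on every iteration, which produces a large amount of short-lived garbage for a long file.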
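As a rough sketch of how the read loop might look with a StringBuilder and try-with-resources combined: the class StringBuilderReadSketch and the method readAll are hypothetical names, and the input is an in-memory StringReader rather than the data.xml file from the question, so treat this as an illustration and not a drop-in replacement for parse().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class StringBuilderReadSketch {

    // Reads every line from the source into a single String, ending each
    // line with '\n' as the loop in the question does. The reader is closed
    // automatically when the try block exits.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // append() writes into the builder's buffer instead of
                // allocating a new String on every iteration.
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            // Assumption for this sketch: rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = readAll(new StringReader("<root>\n  <item/>\n</root>"));
        System.out.print(xml);
    }
}
```

For a real file, a FileReader or an InputStreamReader around a FileInputStream could presumably be passed in place of the StringReader.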

CharithJ answered May 27 '26 01:05