I am reading a very large file and extracting a small portion of text from each line. However, at the end of the operation I am left with very little memory to work with. It seems that the garbage collector fails to free the memory used while reading in the file.
My question is: Is there any way to free this memory? Or is this a JVM bug?
I created an SSCCE to demonstrate this. It reads in a 1 mb file (2 mb in Java due to the 16-bit character encoding) and extracts one character from each line (~4000 lines, so the result should be about 8 kb). At the end of the test, the full 2 mb is still in use!
The initial memory usage:
Allocated: 93847.55 kb
Free: 93357.23 kb
Immediately after reading in the file (before any manual garbage collection):
Allocated: 93847.55 kb
Free: 77613.45 kb (~16mb used)
This is to be expected, since the program uses a lot of memory while reading in the file.
However, when I then garbage collect, not all of the memory is freed:
Allocated: 93847.55 kb
Free: 91214.78 kb (~2 mb used! That's the entire file!)
I know that manually calling the garbage collector doesn't give you any guarantees (in some cases it is lazy). However, this was happening in my larger application, where the file eats up almost all available memory and the rest of the program then runs out of memory. This example confirms my suspicion that the excess data read from the file is not freed.
Here is the SSCCE to generate the test:
import java.io.*;
import java.util.*;

public class Test {
    public static void main(String[] args) throws Throwable {
        Runtime rt = Runtime.getRuntime();
        double alloc = rt.totalMemory()/1000.0;
        double free = rt.freeMemory()/1000.0;
        System.out.printf("Allocated: %.2f kb\nFree: %.2f kb\n\n",alloc,free);

        Scanner in = new Scanner(new File("my_file.txt"));
        ArrayList<String> al = new ArrayList<String>();
        while(in.hasNextLine()) {
            String s = in.nextLine();
            al.add(s.substring(0,1)); // extracts first 1 character
        }

        alloc = rt.totalMemory()/1000.0;
        free = rt.freeMemory()/1000.0;
        System.out.printf("Allocated: %.2f kb\nFree: %.2f kb\n\n",alloc,free);

        in.close();
        System.gc();

        alloc = rt.totalMemory()/1000.0;
        free = rt.freeMemory()/1000.0;
        System.out.printf("Allocated: %.2f kb\nFree: %.2f kb\n\n",alloc,free);
    }
}
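The three measurements in the test repeat the same arithmetic, so a small helper can make the "used" figure explicit instead of leaving it to mental subtraction. This is only a sketch: the names `MemoryReport`, `usedKb`, and `snapshot` are made up for illustration, and it keeps the question's convention of 1 kb = 1000 bytes.

```java
class MemoryReport {
    /** Heap currently in use, in kb (1 kb = 1000 bytes, as in the test). */
    static double usedKb(Runtime rt) {
        return (rt.totalMemory() - rt.freeMemory()) / 1000.0;
    }

    /** One-line summary of allocated, free, and used heap. */
    static String snapshot(String label, Runtime rt) {
        // Read each value once so the three figures are consistent with each other.
        long total = rt.totalMemory();
        long free = rt.freeMemory();
        return String.format("%s: allocated %.2f kb, free %.2f kb, used %.2f kb",
                label, total / 1000.0, free / 1000.0, (total - free) / 1000.0);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println(snapshot("start", rt));
        System.gc(); // only a hint; the JVM is free to ignore it
        System.out.println(snapshot("after gc", rt));
    }
}
```

The numbers are still approximate, since other threads and the JVM itself allocate between the calls.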
Another very fast approach is to have dedicated object pools for different classes of object. Released objects can just be recycled in the pool, using something like a linked list of free object slots. Operating systems often used this kind of approach for common data structures.
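A minimal sketch of such a pool, assuming the pooled objects are `StringBuilder` instances and using an `ArrayDeque` as the free list in place of a hand-written linked list. The class name `BufferPool` and its methods are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BufferPool {
    // Free list of released objects that are ready to be handed out again.
    private final Deque<StringBuilder> free = new ArrayDeque<>();

    /** Returns a recycled builder if one is available, otherwise a new one. */
    StringBuilder acquire() {
        StringBuilder sb = free.poll(); // null when the free list is empty
        return sb != null ? sb : new StringBuilder();
    }

    /** Resets the builder and puts it back on the free list. */
    void release(StringBuilder sb) {
        sb.setLength(0); // clear old contents so the next user starts empty
        free.push(sb);
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool();
        StringBuilder first = pool.acquire();
        first.append("scratch data");
        pool.release(first);
        StringBuilder second = pool.acquire();
        System.out.println(first == second); // the same instance is reused
    }
}
```

This version is not thread-safe and never shrinks; a real pool would likely need a size cap so that it does not itself hold on to memory.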
Java Memory Management, with its built-in garbage collection, is one of the language's finest achievements. It allows developers to create new objects without worrying explicitly about memory allocation and deallocation, because the garbage collector automatically reclaims memory for reuse.
Garbage collection makes Java memory efficient because it removes unreferenced objects from heap memory and frees space for new objects. The Java Virtual Machine offers several garbage collector implementations to choose from.
When you make a substring, the substring keeps a reference to the char array of the original string (this optimization makes handling many substrings of a string very fast). So, as long as you keep your substrings in the al list, you're keeping your whole file in memory. To avoid this, create a new String using the constructor that takes a string as argument.
So basically I'd suggest you do
while(in.hasNextLine()) {
    String s = in.nextLine();
    al.add(new String(s.substring(0,1))); // extracts first 1 character
}
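As a self-contained sketch of the same fix, the loop can be wrapped in a method that takes any `Readable`, so it can be tried without `my_file.txt`. The names `FirstChars` and `firstChars` are invented for this example, and skipping empty lines is an added assumption (the original loop would throw on them).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class FirstChars {
    /** Collects the first character of every non-empty line. */
    static List<String> firstChars(Readable source) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(source)) {
            while (in.hasNextLine()) {
                String line = in.nextLine();
                if (line.isEmpty()) {
                    continue; // substring(0, 1) would throw on an empty line
                }
                // new String(...) copies the one character, so the stored
                // string does not share the backing array of the whole line.
                result.add(new String(line.substring(0, 1)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [a, b, g]
        System.out.println(firstChars(new StringReader("alpha\nbeta\n\ngamma")));
    }
}
```

For a single character, `String.valueOf(line.charAt(0))` would be another way to get a string that does not share the line's array.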
The source code of the String(String) constructor explicitly states that one of its uses is to trim "the baggage":
public String(String original) {
    int size = original.count;
    char[] originalValue = original.value;
    char[] v;
    if (originalValue.length > size) {
        // The array representing the String is bigger than the new
        // String itself. Perhaps this constructor is being called
        // in order to trim the baggage, so make a copy of the array.
        int off = original.offset;
        v = Arrays.copyOfRange(originalValue, off, off+size);
    } else {
        // The array representing the String is the same
        // size as the String, so no point in making a copy.
        v = originalValue;
    }
    this.offset = 0;
    this.count = size;
    this.value = v;
}
Update: this problem is gone as of OpenJDK 7, Update 6, where substring copies the characters it needs instead of sharing the original array. People with a more recent version don't have the problem.