I have a large text file with 20 million lines of text. When I read the file using the following program, it works just fine, and in fact I can read much larger files with no memory problems.
public static void main(String[] args) throws IOException {
    File tempFile = new File("temp.dat");
    String tempLine = null;
    BufferedReader br = null;
    int lineCount = 0;
    try {
        br = new BufferedReader(new FileReader(tempFile));
        while ((tempLine = br.readLine()) != null) {
            lineCount += 1;
        }
    } catch (Exception e) {
        System.out.println("br error: " + e.getMessage());
    } finally {
        if (br != null) { // guard against an NPE if the reader never opened
            br.close();
        }
        System.out.println(lineCount + " lines read from file");
    }
}
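As an aside, since this is Java 7, the same loop can be written with try-with-resources (a sketch of my own, not the original code), which closes the reader automatically and makes the null check in the finally block unnecessary:

public static void main(String[] args) throws IOException {
    int lineCount = 0;
    // the reader is closed automatically, even if an exception is thrown
    try (BufferedReader br = new BufferedReader(new FileReader("temp.dat"))) {
        while (br.readLine() != null) {
            lineCount++;
        }
    }
    System.out.println(lineCount + " lines read from file");
}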
However, if I append some records to this file before reading it, the BufferedReader consumes a huge amount of memory (I have only used Windows Task Manager to monitor this, which I know is not very scientific, but it demonstrates the problem). The amended program is below; it is the same as the first one, except that it appends a single record to the file first.
public static void main(String[] args) throws IOException {
    File tempFile = new File("temp.dat");
    PrintWriter pw = null;
    try {
        pw = new PrintWriter(new BufferedWriter(new FileWriter(tempFile, true)));
        pw.println(" ");
    } catch (Exception e) {
        System.out.println("pw error: " + e.getMessage());
    } finally {
        if (pw != null) { // guard against an NPE if the writer never opened
            pw.close();
        }
    }
    String tempLine = null;
    BufferedReader br = null;
    int lineCount = 0;
    try {
        br = new BufferedReader(new FileReader(tempFile));
        while ((tempLine = br.readLine()) != null) {
            lineCount += 1;
        }
    } catch (Exception e) {
        System.out.println("br error: " + e.getMessage());
    } finally {
        if (br != null) { // same guard as above
            br.close();
        }
        System.out.println(lineCount + " lines read from file");
    }
}
Here is a screenshot of Windows Task Manager; the large bump in the line shows the memory consumption when I run the second version of the program.
In this case I was still able to read the file without running out of memory. But I have much larger files, with more than 50 million records, which hit an out-of-memory error when I run this program against them. Can someone explain why the first version of the program works fine on files of any size, while the second behaves so differently and ends in failure? I am running on Windows 7 with:
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b05)
Java HotSpot(TM) Client VM (build 23.1-b03, mixed mode, sharing)
The best way to view extremely large text files is to use… a text editor. Not just any text editor, but the tools meant for writing code. Such apps can usually handle large files without a hitch and are free. Large Text File Viewer is probably the simplest of these applications.
Use Scanner's findWithinHorizon method. Scanner internally opens a FileChannel to read the file, and for literal pattern matching it ends up using a Boyer-Moore algorithm for efficient string searching.
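A minimal sketch of that approach (the file name "temp.dat" and the search string are my own illustrative assumptions, not from the question):

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class FindRecord {
    public static void main(String[] args) throws FileNotFoundException {
        // try-with-resources closes the Scanner (and its underlying channel)
        try (Scanner scanner = new Scanner(new File("temp.dat"))) {
            // A horizon of 0 means the search is unbounded; the scanner
            // advances past the match if found, or to end of input otherwise.
            String match = scanner.findWithinHorizon("some record", 0);
            System.out.println(match != null ? "found: " + match : "not found");
        }
    }
}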
You can start the Java VM with the VM option

-XX:+HeapDumpOnOutOfMemoryError

This writes a heap dump to a file when an OutOfMemoryError occurs; the dump can then be analysed to find leak suspects. Use a '+' to enable a boolean option and a '-' to disable it.
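For example (the main class name and dump path here are my own illustrative assumptions):

java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=heapdump.hprof LineCounter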
If you are using Eclipse, the Memory Analyzer plugin (MAT) can be used to get heap dumps from running VMs, and it offers some nice analyses, such as leak-suspect reports.