Java OutOfMemoryError when reading a large text file

I'm new to Java and working on reading very large files; I need some help understanding the problem and solving it. We have some legacy code that has to be optimized to run properly. The file size can vary from 10MB to 10GB; the trouble only starts once files grow beyond about 800MB.

InputStream inFileReader = channelSFtp.get(path); // file reading from ssh.
byte[] localbuffer = new byte[2048];
ByteArrayOutputStream bArrStream = new ByteArrayOutputStream();

int i = 0;
while (-1 != (i = inFileReader.read(localbuffer))) {
bArrStream.write(localbuffer, 0, i);
}

byte[] data = bArrStream.toByteArray();
inFileReader.close();
bArrStream.close();

We are getting this error:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
    at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)

Any help would be appreciated.

asked Aug 29 '13 by A.P.S


2 Answers

Try to use java.nio.MappedByteBuffer.

http://docs.oracle.com/javase/7/docs/api/java/nio/MappedByteBuffer.html

You can map a file's content into memory without copying it manually. Modern operating systems offer memory-mapping, and Java has an API to utilize the feature.

If my understanding is correct, memory-mapping does not load a file's entire content into memory (pages are loaded and unloaded as necessary), so I guess a 10GB file won't eat up your memory.
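
Here is a minimal sketch of what that could look like, assuming the file has first been transferred to a local path (memory-mapping works on local files, not on the SFTP InputStream from the question); the /tmp/bigfile.dat path and the 256MB window size are placeholders. Note that FileChannel.map() can map at most Integer.MAX_VALUE bytes per call, so a 10GB file has to be mapped in windows:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedFileExample {
    public static void main(String[] args) throws IOException {
        try (FileChannel channel = FileChannel.open(
                Paths.get("/tmp/bigfile.dat"), StandardOpenOption.READ)) {
            long fileSize = channel.size();
            long window = 256L * 1024 * 1024; // map 256MB at a time
            for (long pos = 0; pos < fileSize; pos += window) {
                long len = Math.min(window, fileSize - pos);
                MappedByteBuffer buf =
                        channel.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    byte b = buf.get(); // the OS pages data in and out as needed
                }
            }
        }
    }
}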

answered Oct 26 '22 by Takahiko Kawasaki


Even though you can increase the JVM memory limit (via the -Xmx startup flag), doing so is needless here, and allocating something huge like 10GB of heap just to process a file sounds like overkill and is resource intensive.

Currently you are using a ByteArrayOutputStream, which keeps the data in an internal in-memory buffer. This line in your code keeps appending the most recently read 2KB chunk to the end of that buffer:

bArrStream.write(localbuffer, 0, i);

bArrStream keeps growing (each time it fills up, ByteArrayOutputStream.grow() allocates a larger array via Arrays.copyOf, which is exactly what your stack trace shows), and eventually you run out of memory.

Instead, you should reorganize your algorithm and process the file in a streaming way:

InputStream inFileReader = channelSFtp.get(path); // file reading from ssh.
byte[] localbuffer = new byte[2048];

int i = 0;
while (-1 != (i = inFileReader.read(localbuffer))) {
    // process the just-read chunk of up to 2KB here
}

inFileReader.close();
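
As an illustration of "processing the chunk", here is a hypothetical variant that computes a SHA-256 digest of the stream; the digest is just a stand-in for whatever per-chunk work your legacy code actually needs to do. Memory use stays at roughly one 2KB buffer no matter how large the file is:

import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamingDigest {
    public static byte[] digest(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] localbuffer = new byte[2048];
        int i;
        while (-1 != (i = in.read(localbuffer))) {
            // consume the chunk immediately instead of accumulating it
            md.update(localbuffer, 0, i);
        }
        return md.digest();
    }
}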
answered Oct 26 '22 by ttekin