
Exceeding byte[] array length (over int upper limit) - java.lang.ArrayIndexOutOfBoundsException

I have a ByteArrayOutputStream object that I'm getting the following error for:

java.lang.ArrayIndexOutOfBoundsException
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:113)

I am trying to load a file that is several gigabytes in size into it by writing byte[] chunks of 250 MB one at a time.

I can watch the byte array grow in size, and as soon as it hits length 2147483647, the upper limit of int, it blows up on the following line:

stream.write(buf); 

stream is the ByteArrayOutputStream; buf is what I'm writing to the stream in 250 MB chunks.

I was planning to do

byte result[] = stream.toByteArray();

at the end. Is there some other approach I can try that will support byte arrays larger than the int upper limit?

Brian asked Feb 22 '12 15:02


3 Answers

Arrays in Java simply can't exceed the bounds of int.

From the JLS section 15.10:

The type of each dimension expression within a DimExpr must be a type that is convertible (§5.1.8) to an integral type, or a compile-time error occurs. Each expression undergoes unary numeric promotion (§). The promoted type must be int, or a compile-time error occurs; this means, specifically, that the type of a dimension expression must not be long.

Likewise in the JVM spec for arraylength:

The arrayref must be of type reference and must refer to an array. It is popped from the operand stack. The length of the array it references is determined. That length is pushed onto the operand stack as an int.

That basically enforces the maximum size of arrays.
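
To make the limit concrete, here is a minimal sketch (the class and method names are made up for illustration). It shows that a long cannot be used as an array dimension, and that narrowing a too-large long to int wraps around to a negative value instead of giving you a bigger array.

```java
final class ArrayLimitDemo {
    /** Returns true if the requested length can be expressed as an array dimension at all. */
    static boolean fitsInArray(long length) {
        return length >= 0 && length <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long wanted = 3_000_000_000L;     // roughly 2.8 GiB
        // byte[] a = new byte[wanted];   // does not compile: the dimension must be int
        int narrowed = (int) wanted;      // narrowing wraps around to a negative number
        System.out.println(fitsInArray(wanted));
        try {
            byte[] b = new byte[narrowed];
            System.out.println("allocated " + b.length + " bytes");
        } catch (NegativeArraySizeException e) {
            System.out.println("cannot allocate: " + e);
        }
    }
}
```

Note that fitsInArray only checks what the language can express. A particular JVM may still refuse an allocation close to Integer.MAX_VALUE, depending on its heap settings and internal limits.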

It's not really clear what you were going to do with the data after loading it, but I'd try to avoid loading it all into memory in the first place.
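
Since the question doesn't say what happens to the data afterwards, the following is only an illustrative sketch of processing a file chunk by chunk. It assumes, purely as an example, that the goal is a CRC32 checksum; StreamingChecksum and crc32Of are hypothetical names. The point is that memory use stays at the buffer size no matter how large the file is.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

final class StreamingChecksum {
    /** Reads the file in fixed-size chunks and feeds each chunk to the checksum. */
    static long crc32Of(Path file, int bufferSize) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[bufferSize];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n); // only the bytes actually read in this pass
            }
        }
        return crc.getValue();
    }
}
```

Whatever the real processing step is, it would take the place of crc.update; the read loop itself stays the same.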

Jon Skeet answered Sep 25 '22 02:09


Use more than one array. When you reach the limit, call ByteArrayOutputStream.toByteArray() to save the current chunk, then clear the stream with ByteArrayOutputStream.reset().
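
A rough sketch of that idea follows. ChunkedByteSink and chunkLimit are names invented for this example, not part of the JDK. Each time the stream reaches the limit, its contents are copied out with toByteArray() and the stream is cleared with reset().

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class ChunkedByteSink {
    private final int chunkLimit;
    private final ByteArrayOutputStream current = new ByteArrayOutputStream();
    private final List<byte[]> chunks = new ArrayList<>();
    private long total; // long, so the overall size may exceed Integer.MAX_VALUE

    ChunkedByteSink(int chunkLimit) {
        if (chunkLimit <= 0) throw new IllegalArgumentException("chunkLimit must be positive");
        this.chunkLimit = chunkLimit;
    }

    void write(byte[] buf) {
        int off = 0;
        while (off < buf.length) {
            int room = chunkLimit - current.size();   // always > 0 here
            int n = Math.min(room, buf.length - off);
            current.write(buf, off, n);
            off += n;
            if (current.size() == chunkLimit) {
                rotate();
            }
        }
        total += buf.length;
    }

    /** Copies the current contents into the chunk list and empties the stream. */
    private void rotate() {
        chunks.add(current.toByteArray());
        current.reset();
    }

    /** Flushes any partially filled chunk and returns all chunks in write order. */
    List<byte[]> finish() {
        if (current.size() > 0) rotate();
        return chunks;
    }

    long totalBytes() {
        return total;
    }
}
```

This only works around the per-array limit. All of the data still lives on the heap, so the JVM needs enough memory for the whole file plus the copy made by each toByteArray() call.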

Dev answered Sep 21 '22 02:09


Using a ByteArrayOutputStream to write several GiB of data is not a good idea, as everything has to be held in the computer's memory. As you have noticed, a byte array is limited to 2^31 - 1 bytes (just under 2 GiB).

Additionally, the buffer that stores the data cannot grow in place. When it fills up, a new one has to be created (usually double the size) and all the data has to be copied from the old buffer into the new one.

My advice would be to use RandomAccessFile and save the data you get to a file. Via RandomAccessFile you can operate on data files larger than 2 GiB.
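
As a minimal sketch of how that could look (LargeFileStore, append and readAt are hypothetical helper names): RandomAccessFile takes and returns file offsets as long, so positions beyond 2^31 - 1 are not a problem. Only each individual chunk read into a byte[] has to fit within the int limit.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class LargeFileStore {
    /** Appends a chunk at the end of the file and returns the offset it was written at. */
    static long append(RandomAccessFile file, byte[] chunk) throws IOException {
        long offset = file.length(); // a long, so not capped at Integer.MAX_VALUE
        file.seek(offset);
        file.write(chunk);
        return offset;
    }

    /** Reads exactly length bytes starting at the given offset. */
    static byte[] readAt(RandomAccessFile file, long offset, int length) throws IOException {
        byte[] out = new byte[length];
        file.seek(offset);
        file.readFully(out); // throws EOFException if the file is too short
        return out;
    }
}
```

In the scenario from the question, each 250 MB buf would be passed to append as it arrives, and later code would pull back only the ranges it needs with readAt.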

Robert answered Sep 23 '22 02:09