I have a function that unpacks a byte array Z
that was packaged using the zlib library (adapted from here).
I get the cryptic error
'MATLAB array exceeds an internal Java limit.'
at the 2nd line of this code:
import com.mathworks.mlwidgets.io.InterruptibleStreamCopier
a=java.io.ByteArrayInputStream(Z);
b=java.util.zip.GZIPInputStream(a);
isc = InterruptibleStreamCopier.getInterruptibleStreamCopier;
c = java.io.ByteArrayOutputStream;
isc.copyStream(b,c);
M=typecast(c.toByteArray,'uint8');
Z=reshape(Z,[],8);
import com.mathworks.mlwidgets.io.InterruptibleStreamCopier
a=java.io.ByteArrayInputStream(Z(:,1));
b=java.util.zip.GZIPInputStream(a);
for ct = 2:8,b.read(Z(:,ct));end
isc = InterruptibleStreamCopier.getInterruptibleStreamCopier;
c = java.io.ByteArrayOutputStream;
isc.copyStream(b,c);
But at this isc.copystream
I get this error:
Java exception occurred:
java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(Unknown Source)
at java.util.zip.InflaterInputStream.read(Unknown Source)
at java.util.zip.GZIPInputStream.read(Unknown Source)
at java.io.FilterInputStream.read(Unknown Source)
at com.mathworks.mlwidgets.io.InterruptibleStreamCopier.copyStream(InterruptibleStreamCopier.java:72)
at com.mathworks.mlwidgets.io.InterruptibleStreamCopier.copyStream(InterruptibleStreamCopier.java:51)
Reading directly from file I tried to read the data directly from a file.
streamCopier = com.mathworks.mlwidgets.io.InterruptibleStreamCopier.getInterruptibleStreamCopier;
fileInStream = java.io.FileInputStream(java.io.File(filename));
fileInStream.skip(datastart);
gzipInStream = java.util.zip.GZIPInputStream( fileInStream );
baos = java.io.ByteArrayOutputStream;
streamCopier.copyStream(gzipInStream,baos);
data = baos.toByteArray;
baos.close;
gzipInStream.close;
fileInStream.close;
Works fine for small files, but with big files I get:
Java exception occurred:
java.lang.OutOfMemoryError
at the line streamCopier.copyStream(gzipInStream,baos);
The bottleneck seems to be the size of each individual Java object being created. This happens in java.io.ByteArrayInputStream(Z)
since a MATLAB array cannot be fed into Java w/o conversion, and also in copyStream
, where data is actually copied into the the output buffer/memory. I had a similar idea for splitting objects into chunks of allowable size (src):
function chunkDunzip(Z)
%% Imports:
import com.mathworks.mlwidgets.io.InterruptibleStreamCopier
%% Definitions:
MAX_CHUNK = 100*1024*1024; % 100 MB, just an example
%% Split to chunks:
nChunks = ceil(numel(Z)/MAX_CHUNK);
chunkBounds = round(linspace(0, numel(Z), max(2,nChunks)) );
V = java.util.Vector();
for indC = 1:numel(chunkBounds)-1
V.add(java.io.ByteArrayInputStream(Z(chunkBounds(indC)+1:chunkBounds(indC+1))));
end
S = java.io.SequenceInputStream(V.elements);
b = java.util.zip.InflaterInputStream(S);
isc = InterruptibleStreamCopier.getInterruptibleStreamCopier;
c = java.io.FileOutputStream(java.io.File('D:\outFile.bin'));
isc.copyStream(b,c);
c.close();
end
Several notes:
FileOutputStream
since it doesn't run into the internal limit of Java objects (as far as my tests went).If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With