
Java: Read from InputStream doesn't always read the same amount of data

For better or worse, I have been using code like the following without any problems:

ZipFile aZipFile = new ZipFile(fileName);   
InputStream zipInput = aZipFile.getInputStream(name);  
int theSize = zipInput.available();  
byte[] content = new byte[theSize];  
zipInput.read(content, 0, theSize);

I have used this approach (get the available size, then read straight into a byte buffer of that size) for file I/O without any issues, and with zip files as well.

But recently I hit a case where zipInput.read(content, 0, theSize); actually reads 3 bytes fewer than the theSize reported as available.

Since the code does not loop and check the length returned by zipInput.read(content, 0, theSize);, I end up with the last 3 bytes of the file missing,
and later the program cannot function properly (the file is a binary file).

Strangely enough, with different zip files of larger size, e.g. 1075 bytes (in my case the problematic zip entry is 867 bytes), the code works fine!

I understand that the logic of the code is probably not the "best" but why am I suddenly getting this problem now?

And how come it works if I immediately run the program with a larger zip entry?

Any input is highly welcome.

Thanks

Cratylus asked Oct 24 '11 11:10


2 Answers

From the InputStream read API docs:

An attempt is made to read as many as len bytes, but a smaller number may be read.

... and:

Returns: the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.

In other words, as long as read has not returned -1 there may still be more data to read, and you cannot rely on a single call to read returning exactly the requested number of bytes. The requested length is only an upper bound on how much data one call will read.
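To make the contract above concrete, here is a minimal sketch of a read loop. The TrickleStream class is a hypothetical helper invented for this illustration (it deliberately returns at most 3 bytes per call), and readFully is just one way to write the loop, not something taken from the question's code.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {

    /** Hypothetical test stream: hands out at most 3 bytes per array read. */
    static class TrickleStream extends InputStream {
        private final InputStream in;

        TrickleStream(byte[] data) {
            this.in = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return in.read(b, off, Math.min(len, 3)); // deliberately short read
        }
    }

    /** Keeps calling read until exactly len bytes have arrived, or fails at end of stream. */
    static void readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n == -1) {
                throw new EOFException("stream ended after " + total + " of " + len + " bytes");
            }
            total += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];

        // A single read is allowed to return fewer bytes than requested.
        int single = new TrickleStream(data).read(new byte[10], 0, 10);
        System.out.println("single read returned " + single + " bytes");

        // The loop keeps going until the buffer is full.
        byte[] buf = new byte[10];
        readFully(new TrickleStream(data), buf, 0, 10);
        System.out.println("readFully filled " + buf.length + " bytes");
    }
}
```

The standard library already offers the same loop as DataInputStream.readFully, so in real code you would probably wrap the stream rather than hand-roll this.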

Adamski answered Sep 17 '22 01:09


available() is not guaranteed to return the total number of bytes remaining until the end of the stream.
Refer to the documentation of InputStream's available() method. It says:

Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.

Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.

An example solution for your problem can be as follows:

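// imports: java.io.ByteArrayOutputStream, java.io.InputStream, java.util.zip.ZipFile
ZipFile aZipFile = new ZipFile(fileName);
InputStream zipInput = aZipFile.getInputStream( caImport );
ByteArrayOutputStream contentStream = new ByteArrayOutputStream();
byte[] buffer = new byte[ 4096 ];
int bytesRead;
// read() returns -1 only at end of stream; available() can be 0 before that
while ( ( bytesRead = zipInput.read( buffer ) ) != -1 )
{
    // here, do what ever you want with the bytesRead bytes just read
    contentStream.write( buffer, 0, bytesRead );
} // while not at end of stream
byte[] contentBytes = contentStream.toByteArray();
...

Because the loop stops only when read() returns -1, rather than relying on available(), it works for input of any size.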
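For completeness, here is a self-contained sketch of reading one zip entry in full without relying on available(). It assumes Java 9 or later for InputStream.readAllBytes(); the class name, the helper writeSampleZip, and the entry name data.bin are made up for the demo and are not from the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipEntryReader {

    /** Reads one entry completely, regardless of how many read calls that takes. */
    static byte[] readEntry(Path zipPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("no such entry: " + entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return in.readAllBytes(); // Java 9+; reads until end of stream
            }
        }
    }

    /** Demo helper (hypothetical test data): writes a zip containing a single entry. */
    static Path writeSampleZip(String entryName, byte[] payload) throws IOException {
        Path zipPath = Files.createTempFile("sample", ".zip");
        try (OutputStream os = Files.newOutputStream(zipPath);
             ZipOutputStream zos = new ZipOutputStream(os)) {
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(payload);
            zos.closeEntry();
        }
        return zipPath;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[867]; // same size as the problematic entry in the question
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) (i * 31);
        }
        Path zipPath = writeSampleZip("data.bin", payload);
        byte[] content = readEntry(zipPath, "data.bin");
        System.out.println("read " + content.length + " bytes");
        Files.delete(zipPath);
    }
}
```

On older Java versions, the explicit read loop does the same job; the try-with-resources blocks also close the ZipFile and the stream, which the shorter snippets leave open.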

Ravinder Reddy answered Sep 18 '22 01:09