What is wrong with FileInputStream.read(byte[])?

In response to my answer to a file-reading question, a commenter stated that FileInputStream.read(byte[]) is "not guaranteed to fill the buffer."

File file = /* ... */  
long len = file.length();
byte[] buffer = new byte[(int)len];
FileInputStream in = new FileInputStream(file);
in.read(buffer);

(The code assumes that the file length does not exceed 2 GB.)

Apart from an IOException, what could cause the read method to not retrieve the entire file contents?

EDIT:

The idea of the code (and the goal of the OP of the question I answered) is to read the entire file into a chunk of memory in one swoop; that's why buffer_size = file_size.

Tony the Pony asked May 25 '11 13:05


5 Answers

Apart from an IOException, what could cause the read method to not retrieve the entire file contents?

In my own API implementation, on my home-rolled file system, I simply choose to fill half the buffer... just kidding.

My point is that even if I weren't kidding, technically speaking it wouldn't be a bug. It is a matter of the method's contract. The contract (documentation) in this case is:

Reads up to b.length bytes of data from this input stream into an array of bytes.

i.e., it gives no guarantees for filling the buffer.

Depending on the API implementation, and perhaps on the file system, the read method may choose not to fill the buffer. It's basically a question of what the contract of the method says.


Bottom line: It probably works, but is not guaranteed to work.
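To make the "fill half the buffer" joke concrete, here is a minimal sketch of a stream that does exactly that and still honours the contract. HalfFillingInputStream and ShortReadDemo are hypothetical names made up for illustration; they are not part of the JDK.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper: never delivers more than half of the requested bytes per call. */
final class HalfFillingInputStream extends FilterInputStream {
    HalfFillingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // Still within the contract: at least one byte is requested
        // unless len is 0, and end of stream is still reported as -1.
        int capped = (len <= 1) ? len : len / 2;
        return super.read(b, off, capped);
    }
}

final class ShortReadDemo {
    /** Returns how many bytes a single read(byte[]) call delivered. */
    static int singleRead(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        // FilterInputStream.read(byte[]) delegates to read(b, 0, b.length).
        return in.read(buffer);
    }
}
```

Wrapping a 100-byte source in this stream and calling read once with a 100-byte buffer delivers only 50 bytes, and nothing about that violates the documentation quoted above.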

aioobe answered Oct 10 '22 12:10


what could cause the read method to not retrieve the entire file contents?

If, for example, the file is fragmented on the filesystem and the low-level implementation knows that it will have to wait for the hard disk to seek to the next fragment (which takes a LOT of time relative to CPU operations), it would make sense for the read() call to return with part of the buffer unfilled, giving the application the chance to start working on the data it has already received.

Now I don't know whether any implementation actually works like that, but the point is that you must not rely on the buffer being filled, because it's not guaranteed by the API contract.
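If the goal is a completely filled buffer, one way to stop relying on a single call is to loop on the return value. This is only a sketch; FillBuffer and its readFully method are names chosen here for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class FillBuffer {
    /**
     * Calls read repeatedly until the buffer is full.
     * Throws EOFException if the stream ends before that happens.
     */
    static void readFully(InputStream in, byte[] buffer) throws IOException {
        int offset = 0;
        while (offset < buffer.length) {
            // Ask only for the bytes that are still missing.
            int n = in.read(buffer, offset, buffer.length - offset);
            if (n < 0) {
                throw new EOFException(
                        "stream ended after " + offset + " of " + buffer.length + " bytes");
            }
            offset += n;
        }
    }
}
```

In the code from the question, this would be called as FillBuffer.readFully(in, buffer) in place of in.read(buffer).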

Michael Borgwardt answered Oct 10 '22 14:10


Well, first off, you've set up a false dichotomy. One perfectly normal circumstance is that the buffer won't be filled because there aren't that many bytes left in the file. That is not an IOException, and it doesn't mean the read failed to retrieve the whole file.

The spec says the method will either return -1, indicating end-of-stream, or block until at least one byte has been read. Implementers of InputStream can optimize as they see fit (e.g. a TCP stream might return data as soon as a packet comes in, regardless of the caller's choice of buffer size). A FileInputStream might fill the buffer with one block's worth of data. As the caller, all you know is that you need to keep reading until the method returns -1.
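A minimal sketch of "keep reading until -1", which does not depend on the file length at all. ReadUntilEof and the 8192-byte chunk size are arbitrary choices made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadUntilEof {
    /** Collects everything the stream delivers until it reports end of stream. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // Each call may deliver anywhere from 1 to chunk.length bytes; only -1 means "done".
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

Because the result grows with the data actually read, this version also copes with a file whose size changes between the length check and the read.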

Edit

In practice, with your example, the only circumstance I can see where the buffer wouldn't be filled (with a standard implementation) is if the file changed size after you allocated the buffer but before you finished reading it. Since you haven't locked the file, this is a possibility.

Mark Peters answered Oct 10 '22 13:10


People have talked about read on a FileInputStream as hypothetically not filling the buffer. In fact it is a reality in some circumstances:

  • If you open a FileInputStream on a "/dev/tty" or a named pipe, then a read will only return you the data that is currently available. Other device files may behave the same way. (These files will probably return 0L as the file size though.)

  • A FUSE file system can be implemented to not completely fill the read buffer if the file system has been mounted with the direct_io option, or a file is opened with the corresponding flag.

The above apply to Linux, but there could well be similar cases for other operating systems and/or Java implementations. The bottom line is that the javadocs allow this behavior and you can get into trouble if your application assumes that it won't occur.

There are third-party libraries that implement "read fully" behavior; e.g. Apache Commons IO provides FileUtils.readFileToByteArray, IOUtils.toByteArray and similar methods. If you want or need that behavior, you should use one of those libraries or implement it yourself.
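For completeness, the JDK itself has a couple of "read fully" methods that avoid an extra dependency. The sketch below assumes Java 7 or later (for java.nio.file.Files and try-with-resources); JdkReadFully and its method names are made up for this example.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

final class JdkReadFully {
    /** Reads the whole file in one call using java.nio.file.Files. */
    static byte[] viaFiles(File file) throws IOException {
        return Files.readAllBytes(file.toPath());
    }

    /** Keeps the original buffer-sized-to-file approach, but with a read that loops internally. */
    static byte[] viaDataInputStream(File file) throws IOException {
        // Same assumption as the question: the file is smaller than 2 GB.
        byte[] buffer = new byte[(int) file.length()];
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            // readFully throws EOFException if the file turns out to be shorter than the buffer.
            in.readFully(buffer);
        }
        return buffer;
    }
}
```

Neither method helps with device files or pipes that report a length of 0; for those, reading until -1 is still the safer pattern.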

Stephen C answered Oct 10 '22 14:10


It's not guaranteed to fill the buffer.

The file size may be smaller than the buffer, or the remainder of the file may be smaller than the buffer.

Yochai Timmer answered Oct 10 '22 13:10