I have written the following code, which writes 4000 bytes of 0s to a file test.txt. Then, I read the same file in chunks of 1000 bytes at a time.
    FileOutputStream output = new FileOutputStream("test.txt");
    ObjectOutputStream stream = new ObjectOutputStream(output);
    byte[] bytes = new byte[4000];
    stream.write(bytes);
    stream.close();

    FileInputStream input = new FileInputStream("test.txt");
    ObjectInputStream s = new ObjectInputStream(input);
    byte[] buffer = new byte[1000];
    int read = s.read(buffer);
    while (read > 0) {
        System.out.println("Read " + read);
        read = s.read(buffer);
    }
    s.close();
What I expect to happen is to read 1000 bytes four times.
    Read 1000
    Read 1000
    Read 1000
    Read 1000
However, what actually happens is that I seem to get "paused" (for lack of a better word) every 1024 bytes.
    Read 1000
    Read 24
    Read 1000
    Read 24
    Read 1000
    Read 24
    Read 928
If I try to read more than 1024 bytes, then I get capped at 1024 bytes. If I try to read less than 1024 bytes, I'm still required to pause at the 1024 byte mark.
Upon inspection of the output file test.txt in hexadecimal, I noticed that there is a sequence of 5 non-zero bytes, 7A 00 00 04 00, repeated 1029 bytes apart, despite the fact that I have written only 0s to the file. Here is the output from my hex editor. (Would be too long to fit in question.)
So my questions are: Why are these five bytes appearing in my file when I have written entirely 0s? Do these 5 bytes have something to do with the pause that occurs every 1024 bytes? Why is this necessary?
You use the buffered input stream to read up to as many bytes as the byte[] array holds. You consume the bytes read and then move on to reading more bytes from the file. Hence you don't need to know the file size in order to read it.
With a BufferedInputStream, a call to read() fills an internal buffer with up to 8192 bytes (the default buffer size) from the underlying stream and keeps them until they are needed. It still returns only a single byte (but keeps the others in reserve). This way the BufferedInputStream makes fewer native calls to the OS to read from the file.
Using the InputStreamReader class: instantiate an InputStreamReader by passing your InputStream object as a parameter. Read the contents of the stream into a character array using the read() method of the InputStreamReader class.
The read(byte[] buf, int off, int len) method of the ObjectInputStream class reads data into an array of bytes. It accepts 3 parameters: buf is the byte array into which the data is stored, off is the start offset in that array, and len is the maximum number of bytes to read. It returns an integer value indicating the number of bytes read, or -1 if the end of the stream is reached without reading a single byte.
The no-argument read() method of java.io.ObjectInputStream reads a single byte of data and blocks if no input is available. No parameter is passed. It returns the byte read as an integer value, or -1 if the end of the stream is reached.
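The difference between the two return values is easy to mix up, so here is a minimal sketch of it. It assumes an in-memory stream instead of a file, and the class name ReadVariantsDemo and the three sample bytes are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Hypothetical demo class: contrasts read() with read(buf, off, len).
class ReadVariantsDemo {

    /** Returns {value from read(), count from read(buf, off, len), result of read() at end of stream}. */
    static int[] demo() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.write(new byte[] {7, 8, 9}); // three sample bytes (arbitrary values)
        }

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            int first = in.read();                    // the byte itself, as an int
            byte[] buf = new byte[10];
            int count = in.read(buf, 0, buf.length);  // how many bytes were stored in buf
            int atEnd = in.read();                    // -1 once nothing is left
            return new int[] {first, count, atEnd};
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(demo())); // prints [7, 2, -1]
    }
}
```

The first call yields the value 7, the array read reports that the 2 remaining bytes were stored, and the final call signals end of stream with -1.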
The object streams use an internal 1024-byte buffer, and write primitive data in chunks of that size, in blocks of the stream headed by Block Data markers, which are, guess what, 0x7A followed by a 32-bit length word (or 0x77 followed by an 8-bit length word). The five bytes 7A 00 00 04 00 in your file are exactly that: the marker followed by the length 0x00000400 = 1024. So a single read call can only ever return a maximum of 1024 bytes.
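To see the markers without a hex editor, the following sketch serializes 4000 zero bytes into memory and walks the block-data records. It assumes the stream contains only block data after the 4-byte stream header (true for this particular write). The class and method names are illustrative, not part of any API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for inspecting block-data headers.
class BlockDataMarkers {

    /** Writes n zero bytes through an ObjectOutputStream and returns the raw stream bytes. */
    static byte[] serialize(int n) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.write(new byte[n]);
        }
        return sink.toByteArray();
    }

    /** Walks the records and returns the offset of every block-data header. */
    static List<Integer> headerOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        int pos = 4; // skip the stream header (magic number and version)
        while (pos < data.length) {
            int marker = data[pos] & 0xFF;
            int headerSize;
            int length;
            if (marker == 0x7A) {        // long form: 32-bit big-endian length
                headerSize = 5;
                length = ByteBuffer.wrap(data, pos + 1, 4).getInt();
            } else if (marker == 0x77) { // short form: 8-bit length
                headerSize = 2;
                length = data[pos + 1] & 0xFF;
            } else {
                throw new IllegalStateException("Unexpected marker at offset " + pos);
            }
            offsets.add(pos);
            pos += headerSize + length;
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(4000);
        System.out.println("Total size: " + data.length + " bytes");
        for (int offset : headerOffsets(data)) {
            System.out.printf("Block header at offset %d, marker 0x%02X%n",
                    offset, data[offset] & 0xFF);
        }
    }
}
```

For 4000 bytes this finds four headers at offsets 4, 1033, 2062 and 3091, that is, 1029 bytes apart (5 header bytes plus 1024 data bytes). The total comes to 4024 bytes rather than 4000.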
The real question here is why you're using object streams just to read and write bytes. Use buffered streams. Then the buffering is under your control, and incidentally there's zero space overhead, unlike the object streams which have stream headers and type codes.
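A minimal sketch of that suggestion, assuming plain BufferedOutputStream and BufferedInputStream wrappers around file streams. The class name, the method name and the temporary file are placeholders for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: the question's round trip with buffered streams.
class BufferedRoundTrip {

    /** Writes totalBytes zeros to file, reads them back in chunks, returns each read size. */
    static List<Integer> writeThenRead(Path file, int totalBytes, int chunkSize)
            throws IOException {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(file.toFile()))) {
            out.write(new byte[totalBytes]); // raw bytes only: no header, no markers
        }

        List<Integer> sizes = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()))) {
            byte[] buffer = new byte[chunkSize];
            int read;
            while ((read = in.read(buffer)) > 0) {
                sizes.add(read);
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("test", ".bin"); // placeholder file name
        for (int size : writeThenRead(file, 4000, 1000)) {
            System.out.println("Read " + size);
        }
        System.out.println("File size: " + Files.size(file));
        Files.delete(file);
    }
}
```

The file ends up exactly 4000 bytes long, and the reads are served from the stream's own buffer, so no 1024-byte boundary is involved. As with any InputStream, a single read may still return fewer bytes than requested, so the loop should not assume full chunks.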
NB serialized data is not text and shouldn't be stored in files named .txt.
ObjectOutputStream and ObjectInputStream are special streams used for serialization of objects.
But when you do stream.write(bytes); you are trying to use the ObjectOutputStream as a regular stream, for writing 4000 bytes, not for writing an array-of-bytes object. When data are written like this to an ObjectOutputStream they are handled specially.
From the documentation of ObjectOutputStream (emphasis mine):
Primitive data, excluding serializable fields and externalizable data, is written to the ObjectOutputStream in block-data records. A block data record is composed of a header and data. The block data header consists of a marker and the number of bytes to follow the header. Consecutive primitive data writes are merged into one block-data record. The blocking factor used for a block-data record will be 1024 bytes. Each block-data record will be filled up to 1024 bytes, or be written whenever there is a termination of block-data mode.
I hope from this it is obvious why you are receiving this behaviour.
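To spell the arithmetic out, here is a small simulation. It is a sketch that assumes each read returns at most what is left in the current 1024-byte block; it models the rule only and does not touch the real streams. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of reads that never cross a block-data boundary.
class BlockReadSimulation {

    static final int BLOCK_SIZE = 1024; // blocking factor quoted from the documentation

    /** Returns the sizes a read loop would see for the given payload and buffer size. */
    static List<Integer> readSizes(int totalBytes, int bufferSize) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = totalBytes;
        while (remaining > 0) {
            // The last block may be shorter than BLOCK_SIZE.
            int leftInBlock = Math.min(BLOCK_SIZE, remaining);
            remaining -= leftInBlock;
            while (leftInBlock > 0) {
                // A read gets the smaller of the buffer size and what the block still holds.
                int n = Math.min(bufferSize, leftInBlock);
                sizes.add(n);
                leftInBlock -= n;
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(readSizes(4000, 1000)); // [1000, 24, 1000, 24, 1000, 24, 928]
        System.out.println(readSizes(4000, 2000)); // [1024, 1024, 1024, 928]
    }
}
```

The first line reproduces the output from the question. The second shows the cap at 1024 bytes when the buffer is larger than a block.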
I would recommend that you either use BufferedOutputStream instead of ObjectOutputStream, or, if you really want to use ObjectOutputStream, then use writeObject() instead of write(). The corresponding applies to input.
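The writeObject() route could look roughly like this. The sketch assumes the whole array is serialized as one object and read back with readObject(). It uses in-memory streams for brevity, and the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class: a byte[] written and read back as a single object.
class ByteArrayObjectDemo {

    /** Serializes payload as one object and deserializes it again. */
    static byte[] roundTrip(byte[] payload) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(payload); // written as an array object, not as block data
        }

        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            return (byte[]) in.readObject(); // the whole array comes back in one call
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        byte[] copy = roundTrip(new byte[4000]);
        System.out.println("Read " + copy.length);
    }
}
```

Here the array arrives in a single readObject() call, so there is no chunked read loop at all. The trade-off is that the entire array must fit in memory on both sides.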