
Why can I only read 1024 bytes at a time with ObjectInputStream?

I have written the following code which writes 4000 bytes of 0s to a file test.txt. Then, I read the same file in chunks of 1000 bytes at a time.

    FileOutputStream output = new FileOutputStream("test.txt");
    ObjectOutputStream stream = new ObjectOutputStream(output);

    byte[] bytes = new byte[4000];

    stream.write(bytes);
    stream.close();

    FileInputStream input = new FileInputStream("test.txt");
    ObjectInputStream s = new ObjectInputStream(input);

    byte[] buffer = new byte[1000];
    int read = s.read(buffer);

    while (read > 0) {
        System.out.println("Read " + read);
        read = s.read(buffer);
    }

    s.close();

What I expect to happen is to read 1000 bytes four times.

    Read 1000
    Read 1000
    Read 1000
    Read 1000

However, what actually happens is that I seem to get "paused" (for lack of a better word) every 1024 bytes.

    Read 1000
    Read 24
    Read 1000
    Read 24
    Read 1000
    Read 24
    Read 928

If I try to read more than 1024 bytes at once, I get capped at 1024 bytes. If I try to read fewer than 1024 bytes, I'm still forced to pause at the 1024-byte mark.

Upon inspecting the output file test.txt in hexadecimal, I noticed that the 5-byte sequence 7A 00 00 04 00 recurs every 1029 bytes, despite the fact that I have written only 0s to the file. Here is the output from my hex editor. (Would be too long to fit in question.)

So my question is: why are these five bytes appearing in my file when I have written entirely 0s? Do these 5 bytes have something to do with the pause that occurs every 1024 bytes? Why is this necessary?

Zsw asked Nov 29 '15 06:11


2 Answers

The object streams use an internal 1024-byte buffer, and write primitive data in chunks of that size, in blocks of the stream headed by Block Data markers, which are, guess what, 0x7A followed by a 32-bit length word (or 0x77 followed by an 8-bit length word). Your 7A 00 00 04 00 is exactly that: the marker followed by the length 0x00000400, i.e. 1024. A single read() call never crosses a block boundary, so it can only ever return a maximum of 1024 bytes.
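To see the block headers without a hex editor, here is a minimal sketch that serializes 4000 zero bytes into memory and then walks the result, collecting the length announced by each block-data header. The class and method names (`BlockDataInspector`, `serializeZeros`, `blockLengths`) are made up for illustration, and the parser assumes the stream contains nothing but block-data records after the 4-byte stream header, which holds for this particular payload only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original question.
class BlockDataInspector {

    // Writes `count` zero bytes through an ObjectOutputStream, as the question does,
    // but into memory instead of test.txt.
    static byte[] serializeZeros(int count) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.write(new byte[count]);
        }
        return sink.toByteArray();
    }

    // Walks the serialized bytes and returns the payload length of every block-data record.
    // Assumption: only block-data records follow the stream header.
    static List<Integer> blockLengths(byte[] serialized) throws IOException {
        List<Integer> lengths = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
            in.readShort(); // stream magic (0xACED)
            in.readShort(); // stream version
            while (in.available() > 0) {
                byte marker = in.readByte();
                int length;
                if (marker == ObjectStreamConstants.TC_BLOCKDATALONG) {   // 0x7A
                    length = in.readInt();                                // 32-bit length word
                } else if (marker == ObjectStreamConstants.TC_BLOCKDATA) { // 0x77
                    length = in.readUnsignedByte();                       // 8-bit length word
                } else {
                    throw new IOException("Unexpected marker: " + marker);
                }
                lengths.add(length);
                in.skipBytes(length); // skip the block's payload
            }
        }
        return lengths;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(blockLengths(serializeZeros(4000)));
    }
}
```

For the 4000-byte payload in the question this should report three blocks of 1024 bytes and a final block of 928 bytes, which lines up with the 1000/24/.../928 pattern of reads.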

The real question here is why you're using object streams just to read and write bytes. Use buffered streams. Then the buffering is under your control, and incidentally there's zero space overhead, unlike the object streams which have stream headers and type codes.
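As a rough sketch of the buffered-stream approach, the following writes 4000 raw bytes and reads them back in 1000-byte chunks. The class name `BufferedChunks`, the method name, and the temp-file name are placeholders. It uses `InputStream.readNBytes(byte[], int, int)` (Java 9+) rather than plain `read(buffer)`, on the assumption that you want full chunks: `read` is allowed to return fewer bytes than requested on any stream, whereas `readNBytes` keeps reading until the chunk is full or the stream ends.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the original answer.
class BufferedChunks {

    // Writes `total` zero bytes to `file`, then reads them back `chunk` bytes at a time
    // and returns the size of each read.
    static List<Integer> writeThenReadInChunks(Path file, int total, int chunk) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file))) {
            out.write(new byte[total]); // raw bytes only: no stream header, no block markers
        }

        List<Integer> sizes = new ArrayList<>();
        byte[] buffer = new byte[chunk];
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int read;
            // readNBytes returns 0 once the end of the stream has been reached
            while ((read = in.readNBytes(buffer, 0, buffer.length)) > 0) {
                sizes.add(read);
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("raw-bytes", ".bin");
        try {
            for (int n : writeThenReadInChunks(file, 4000, 1000)) {
                System.out.println("Read " + n);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The resulting file is exactly 4000 bytes long, since nothing but the payload is written.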

NB: serialized data is not text and shouldn't be stored in files named .txt.

user207421 answered Nov 11 '22 20:11


ObjectOutputStream and ObjectInputStream are special streams used for serialization of objects.

But when you do stream.write(bytes); you are trying to use the ObjectOutputStream as a regular stream, for writing 4000 bytes, not for writing an array-of-bytes object. When data are written like this to an ObjectOutputStream they are handled specially.

From the documentation of ObjectOutputStream:

(emphasis mine.)

Primitive data, excluding serializable fields and externalizable data, is written to the ObjectOutputStream in block-data records. A block data record is composed of a header and data. The block data header consists of a marker and the number of bytes to follow the header. Consecutive primitive data writes are merged into one block-data record. The blocking factor used for a block-data record will be 1024 bytes. Each block-data record will be filled up to 1024 bytes, or be written whenever there is a termination of block-data mode.

I hope this makes it obvious why you are seeing this behaviour: each read() is served from the current block-data record, so it stops at the 1024-byte block boundary.
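To make the effect of the block-data records concrete, here is a small in-memory sketch that compares `read(buffer)` with `readFully(buffer)` on an ObjectInputStream. The class and method names are invented for the example. `readFully` comes from the `DataInput` interface that ObjectInputStream implements; it keeps reading across block boundaries until the buffer is full, and throws `EOFException` if the stream ends first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not part of the original answer.
class ReadVersusReadFully {

    // Serializes `count` zero bytes with ObjectOutputStream.write, as in the question.
    static byte[] serialize(int count) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.write(new byte[count]);
        }
        return sink.toByteArray();
    }

    // Sizes returned by plain read(buffer): each call stops at a block boundary.
    static List<Integer> readSizes(byte[] serialized, int chunk) throws IOException {
        List<Integer> sizes = new ArrayList<>();
        byte[] buffer = new byte[chunk];
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            int read;
            while ((read = in.read(buffer)) > 0) {
                sizes.add(read);
            }
        }
        return sizes;
    }

    // Number of chunks completely filled by readFully(buffer).
    static int fullChunks(byte[] serialized, int chunk, int expectedChunks) throws IOException {
        byte[] buffer = new byte[chunk];
        int filled = 0;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            for (int i = 0; i < expectedChunks; i++) {
                in.readFully(buffer); // throws EOFException if fewer than `chunk` bytes remain
                filled++;
            }
        }
        return filled;
    }

    public static void main(String[] args) throws IOException {
        byte[] serialized = serialize(4000);
        System.out.println("read():      " + readSizes(serialized, 1000));
        System.out.println("readFully(): " + fullChunks(serialized, 1000, 4) + " full chunks");
    }
}
```

If you have to keep the object streams for some reason, `readFully` is one way to get whole 1000-byte chunks regardless of the 1024-byte blocking factor.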

I would recommend that you either use BufferedOutputStream instead of ObjectOutputStream, or, if you really want to use ObjectOutputStream, then use writeObject() instead of write(). The same applies to input: use BufferedInputStream, or readObject() instead of read().
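A minimal sketch of the writeObject()/readObject() variant, assuming you want the byte array back as a single object. The class name `ByteArrayRoundTrip` and its method are hypothetical, and the example round-trips through memory instead of a file to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical example class, not part of the original answer.
class ByteArrayRoundTrip {

    // Serializes the array as one object and deserializes it again.
    static byte[] roundTrip(byte[] payload) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(payload); // written as an array object, not as block-data records
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            return (byte[]) in.readObject(); // the whole array comes back in one call
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        byte[] copy = roundTrip(new byte[4000]);
        System.out.println("Read " + copy.length);
    }
}
```

Here the array arrives whole, so there is no chunk size to choose on the reading side at all.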

Mike Nakis answered Nov 11 '22 20:11