
Is it more effective to buffer an output stream than an input stream in Java?

Tags: java, stream

Being bored earlier today, I started thinking a bit about the relative performance of buffered and unbuffered byte streams in Java. As a simple test, I downloaded a reasonably large text file and wrote a short program to determine the effect that buffered streams have when copying the file. Four tests were performed:

  1. Copying the file using unbuffered input and output byte streams.
  2. Copying the file using a buffered input stream and an unbuffered output stream.
  3. Copying the file using an unbuffered input stream and a buffered output stream.
  4. Copying the file using buffered input and output streams.

Unsurprisingly, using buffered input and output streams is orders of magnitude faster than using unbuffered streams. However, the really interesting thing (to me at least) was the difference in speed between cases 2 and 3. Some sample results are as follows:

Unbuffered input, unbuffered output
Time: 36.602513585

Buffered input, unbuffered output
Time: 26.449306847

Unbuffered input, buffered output
Time: 6.673194184

Buffered input, buffered output
Time: 0.069888689

For those interested, the code is available here at GitHub. Can anyone shed any light on why the times for cases 2 and 3 are so asymmetric?
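
The linked program is not reproduced on this page, so here is a minimal sketch of how the four cases could be set up. It is an assumption about the test, not the asker's actual code: the class name `StreamCopyBenchmark`, the helper `timeCopy`, and the byte-at-a-time copy loop are all hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical reconstruction of the four-case test; not the asker's code.
class StreamCopyBenchmark {

    // Copies byte by byte, so the only buffering is whatever the stream wrappers add.
    static long copy(InputStream in, OutputStream out) throws IOException {
        long copied = 0;
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
            copied++;
        }
        out.flush();
        return copied;
    }

    // Runs one of the four cases and returns the elapsed wall-clock time in seconds.
    static double timeCopy(File src, File dst, boolean bufferIn, boolean bufferOut)
            throws IOException {
        long start = System.nanoTime();
        try (InputStream in = bufferIn
                     ? new BufferedInputStream(new FileInputStream(src))
                     : new FileInputStream(src);
             OutputStream out = bufferOut
                     ? new BufferedOutputStream(new FileOutputStream(dst))
                     : new FileOutputStream(dst)) {
            copy(in, out);
        }
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StreamCopyBenchmark <source> <destination>");
            return;
        }
        File src = new File(args[0]);
        File dst = new File(args[1]);
        boolean[] choices = {false, true};
        // Loop order matches cases 1 to 4 in the question.
        for (boolean bufferOut : choices) {
            for (boolean bufferIn : choices) {
                System.out.printf("%s input, %s output%nTime: %.9f%n%n",
                        bufferIn ? "Buffered" : "Unbuffered",
                        bufferOut ? "buffered" : "unbuffered",
                        timeCopy(src, dst, bufferIn, bufferOut));
            }
        }
    }
}
```

A single run like this includes JIT warm-up and OS file-cache effects, so the absolute numbers will vary between machines and between runs.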

pmcs asked Sep 06 '12 20:09



2 Answers

When you read a file, the filesystem and the devices below it do various levels of caching. They almost never read one byte at a time; they read a block. On a subsequent read of the next byte, the block will already be in cache, so the read will be much faster.

It stands to reason then that if your buffer size is the same size as your block size, buffering the input stream doesn't actually gain you all that much (it saves a few system calls, but in terms of actual physical I/O it doesn't save you too much).

When you write a file, the filesystem can't cache for you because you haven't given it a backlog of things to write. It could potentially buffer the output for you, but it has to make an educated guess at how often to flush the buffer. By buffering the output yourself, you let the device do much more work at once because you manually build up that backlog.
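
To make the "backlog" idea concrete, here is a small sketch that counts how many write calls actually reach the underlying stream. `CountingSink` is a hypothetical stand-in for a file (it discards the data), and the 8192-byte buffer size is chosen only for illustration; nothing here measures real disk I/O.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WriteCallCounter {

    // Hypothetical stand-in for a file: discards data but counts how often it is called.
    static final class CountingSink extends OutputStream {
        long calls = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    // Writes totalBytes one at a time and returns the number of calls that reached the sink.
    static long sinkCalls(int totalBytes, boolean buffered) throws IOException {
        CountingSink sink = new CountingSink();
        OutputStream out = buffered ? new BufferedOutputStream(sink, 8192) : sink;
        for (int i = 0; i < totalBytes; i++) {
            out.write(i);
        }
        out.flush(); // push any partially filled buffer through to the sink
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered calls: " + sinkCalls(100_000, false));
        System.out.println("buffered calls:   " + sinkCalls(100_000, true));
    }
}
```

With an 8192-byte buffer, 100,000 single-byte writes reach the sink as 13 chunks (12 full buffers plus one partial), instead of 100,000 separate calls. On a real `FileOutputStream`, each of those calls would typically be a system call.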

Mark Peters answered Oct 06 '22 00:10


To answer your title question: yes, it is more effective to buffer the output. The reason is the way hard disk drives (HDDs) write data to their sectors, especially on fragmented disks. Reading is much faster because the disk already knows where the data is, whereas a write first has to determine where the data will fit. With a buffer, the disk can find a larger contiguous free space for the data than it can with unbuffered writes.

For fun, run another test: create a new partition on your disk and run your tests reading from and writing to that clean slate. To compare apples to apples, format the newly created partition between tests. Please post your numbers if you run the tests.

gh. answered Oct 06 '22 00:10