Is it more effective to buffer an output stream than an input stream in Java?

Tags: java, stream

Being bored earlier today, I started thinking about the relative performance of buffered and unbuffered byte streams in Java. As a simple test, I downloaded a reasonably large text file and wrote a short program to measure the effect that buffered streams have when copying the file. Four tests were performed:

1. Copying the file using unbuffered input and output byte streams.
2. Copying the file using a buffered input stream and an unbuffered output stream.
3. Copying the file using an unbuffered input stream and a buffered output stream.
4. Copying the file using buffered input and output streams.

Unsurprisingly, using buffered input and output streams is orders of magnitude faster than using unbuffered streams. However, the really interesting thing (to me at least) was the difference in speed between cases 2 and 3. Some sample results are as follows:

Unbuffered input, unbuffered output
Time: 36.602513585

Buffered input, unbuffered output
Time: 26.449306847

Unbuffered input, buffered output
Time: 6.673194184

Buffered input, buffered output
Time: 0.069888689

For those interested, the code is available on GitHub. Can anyone shed light on why the times for cases 2 and 3 are so asymmetric?
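The linked code is not reproduced on this page. As a rough idea of what such a test could look like, here is a minimal sketch, not the original program. It assumes `FileInputStream`/`FileOutputStream` as the unbuffered streams and a byte-at-a-time copy loop; the class name `CopyBenchmark` and the methods `copy` and `time` are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of the four-case copy test; not the asker's actual code.
class CopyBenchmark {

    // Copies one byte per call, so any batching comes only from the stream wrappers.
    static long copy(InputStream in, OutputStream out) throws IOException {
        long copied = 0;
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
            copied++;
        }
        out.flush();
        return copied;
    }

    // Copies src to dst once and returns the elapsed time in seconds.
    static double time(File src, File dst, boolean bufferIn, boolean bufferOut)
            throws IOException {
        long start = System.nanoTime();
        try (InputStream rawIn = new FileInputStream(src);
             OutputStream rawOut = new FileOutputStream(dst);
             InputStream in = bufferIn ? new BufferedInputStream(rawIn) : rawIn;
             OutputStream out = bufferOut ? new BufferedOutputStream(rawOut) : rawOut) {
            copy(in, out);
        }
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CopyBenchmark <input-file>");
            return;
        }
        File src = new File(args[0]);
        File dst = File.createTempFile("copy", ".tmp");
        try {
            boolean[] options = {false, true};
            for (boolean bufferIn : options) {
                for (boolean bufferOut : options) {
                    System.out.printf("%s input, %s output%nTime: %f%n%n",
                            bufferIn ? "Buffered" : "Unbuffered",
                            bufferOut ? "buffered" : "unbuffered",
                            time(src, dst, bufferIn, bufferOut));
                }
            }
        } finally {
            dst.delete();
        }
    }
}
```

A single timed run per case like this is sensitive to OS file caching and JIT warm-up, so the absolute numbers will vary between machines and runs.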


asked Sep 06 '12 20:09

pmcs

2 Answers

When you read a file, the filesystem and the devices below it do various levels of caching. They almost never read one byte at a time; they read a block. On a subsequent read of the next byte, the block will already be in cache, so the read will be much faster.

It stands to reason, then, that if your buffer size is the same as your block size, buffering the input stream doesn't gain you all that much: it saves system calls, but it saves little in terms of actual physical I/O.

When you write a file, the filesystem can't cache for you because you haven't given it a backlog of things to write. It could potentially buffer the output for you, but it has to make an educated guess at how often to flush the buffer. By buffering the output yourself, you let the device do much more work at once because you manually build up that backlog.
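To make the "backlog" idea concrete, the sketch below counts how many calls actually reach the underlying stream during a byte-by-byte copy, with and without the `Buffered*` wrappers. It uses in-memory streams, so it only illustrates call batching (a stand-in for system calls), not physical disk I/O or OS caching. The class `CallCounter` and its nested counting wrappers are hypothetical helpers written for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: counts calls that reach the wrapped (underlying) stream.
class CallCounter {

    static final class CountingInput extends FilterInputStream {
        int calls;
        CountingInput(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            calls++;
            return super.read();
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    static final class CountingOutput extends FilterOutputStream {
        int calls;
        CountingOutput(OutputStream out) { super(out); }
        @Override public void write(int b) throws IOException {
            calls++;
            out.write(b);
        }
        @Override public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            out.write(b, off, len); // pass the whole chunk through as one call
        }
    }

    // Returns {underlying read calls, underlying write calls} for a
    // byte-at-a-time copy of `size` bytes.
    static int[] countCalls(int size, boolean bufferIn, boolean bufferOut)
            throws IOException {
        CountingInput rawIn = new CountingInput(new ByteArrayInputStream(new byte[size]));
        CountingOutput rawOut = new CountingOutput(new ByteArrayOutputStream());
        InputStream in = bufferIn ? new BufferedInputStream(rawIn) : rawIn;
        OutputStream out = bufferOut ? new BufferedOutputStream(rawOut) : rawOut;
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        out.flush();
        return new int[] {rawIn.calls, rawOut.calls};
    }

    public static void main(String[] args) throws IOException {
        int size = 1 << 20; // 1 MiB of zero bytes
        boolean[] options = {false, true};
        for (boolean bufferIn : options) {
            for (boolean bufferOut : options) {
                int[] calls = countCalls(size, bufferIn, bufferOut);
                System.out.printf("bufferIn=%b bufferOut=%b reads=%d writes=%d%n",
                        bufferIn, bufferOut, calls[0], calls[1]);
            }
        }
    }
}
```

Without a wrapper, every byte costs one call on the underlying stream; with a wrapper, the calls are batched into chunks of the wrapper's buffer size (8192 bytes by default in common JDKs). The batching is symmetric for reads and writes here, which suggests the asymmetry in the measured times comes from what happens below the Java layer.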


answered Oct 06 '22 00:10

Mark Peters

To answer your title question: yes, it is more effective to buffer the output. The reason is the way hard disk drives (HDDs) write data to their sectors, especially on fragmented disks. Reading is much faster because the disk already knows where the data is, whereas a write first has to determine where the data will fit. With a buffer, the disk can find a larger contiguous free space for the data than it can for unbuffered writes. For fun, run another test: create a new partition on your disk and run your tests reading from and writing to that clean slate. To compare apples to apples, format the newly created partition between tests. Please post your numbers if you run the tests.

answered Oct 06 '22 00:10

gh.
