
Performance: BufferedOutputStream vs FileOutputStream in Java

Tags: java

I have read that the BufferedOutputStream class improves efficiency and should be wrapped around a FileOutputStream like this:

BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream("myfile.txt"));

The following statement also works for writing to the same file:

FileOutputStream fout = new FileOutputStream("myfile.txt");

The recommended way is to use a buffer for read/write operations, which is the only reason I prefer it too.

My question is how to measure the performance of the two statements above. Is there a tool, or something similar, that would be useful for analysing their performance?

As I am new to Java, I am very curious to know about this.

Chaitanya Ghule asked Apr 20 '17 19:04



2 Answers

Buffering is only helpful if you would otherwise be doing inefficient reading or writing, that is, many small operations. For reading, it lets you read line by line at a reasonable cost, even though you could gobble up bytes / chars faster just using read(byte[]) or read(char[]). For writing, it lets you collect the pieces you want to send through I/O in the buffer and send them only when the buffer fills or on flush (see the autoFlush constructor argument of PrintWriter and PrintStream).

But if you are just trying to read or write as fast as you can, and you already work with large arrays, buffering doesn't improve performance.
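The "send them only on flush" behaviour can be observed directly. Here is a minimal sketch, assuming a ByteArrayOutputStream stands in for the file so that the bytes that actually reached the underlying stream can be counted; the class and method names are made up for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class FlushDemo {
    // Returns {bytes in the sink before flush(), bytes in the sink after flush()}.
    static int[] sinkSizesAroundFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the file
        BufferedOutputStream out = new BufferedOutputStream(sink, 8192);

        // 5 bytes fit easily into the 8192-byte buffer, so nothing is forwarded yet.
        out.write("hello".getBytes(StandardCharsets.US_ASCII));
        int before = sink.size();

        // flush() pushes the buffered bytes down to the underlying stream.
        out.flush();
        int after = sink.size();

        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sinkSizesAroundFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
        // prints "before flush: 0, after flush: 5"
    }
}
```

With a real FileOutputStream underneath, each forwarded batch is one write to the OS instead of one per write() call, which is where the saving comes from.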

For an example of efficient reading from a file:

File f = ...;
byte[] bytes = new byte[(int) f.length()]; // f.length() must fit in an int, i.e. be under 2 GiB :)
try (FileInputStream in = new FileInputStream(f)) {
  int off = 0, n;
  // a single read(bytes) isn't guaranteed by the API to fill the array, so loop until it is full
  while (off < bytes.length && (n = in.read(bytes, off, bytes.length - off)) > 0) off += n;
}

Versus inefficient line-by-line reading, where the buffer is what keeps it tolerable:

File f = ...;
BufferedReader in = new BufferedReader(new FileReader(f)); // BufferedReader wraps a Reader, not a File
String line = null;
while ((line = in.readLine()) != null) {
  // If every readLine() call read directly from the FS / hard drive,
  // it would slow things down tremendously. That's why having a buffer
  // capture the file contents and effectively reading from the buffer
  // is more efficient.
}
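On the writing side, which is what the question asks about, a rough way to see the difference is to time many one-byte writes with and without the buffer. This is only a sketch using System.nanoTime(): the class name, temp files, and the count of 1,000,000 bytes are assumptions for illustration, and a single un-warmed run like this is not a rigorous benchmark.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteTiming {
    // Writes `count` bytes one at a time, closes the stream, and returns elapsed nanoseconds.
    static long timePerByteWrites(OutputStream out, int count) throws IOException {
        long start = System.nanoTime();
        try (OutputStream o = out) {
            for (int i = 0; i < count; i++) {
                o.write(i); // writes the low 8 bits of i
            }
        } // close() flushes any buffered data, so that cost is included in the timing
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int count = 1_000_000; // arbitrary workload size, chosen for illustration
        Path plain = Files.createTempFile("plain", ".bin");
        Path buffered = Files.createTempFile("buffered", ".bin");
        try {
            long unbufferedNs = timePerByteWrites(
                    new FileOutputStream(plain.toFile()), count);
            long bufferedNs = timePerByteWrites(
                    new BufferedOutputStream(new FileOutputStream(buffered.toFile())), count);
            System.out.printf("FileOutputStream:     %d ms%n", unbufferedNs / 1_000_000);
            System.out.printf("BufferedOutputStream: %d ms%n", bufferedNs / 1_000_000);
        } finally {
            Files.deleteIfExists(plain);
            Files.deleteIfExists(buffered);
        }
    }
}
```

If you replace the loop with one write(byte[]) of the whole array, the two streams should come out much closer, which is the point made above about bulk writes.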
ControlAltDel answered Oct 03 '22 00:10


These numbers came from a MacBook Pro laptop using an SSD.

  • BufferedFileStreamArrayBatchRead (809716.60-911577.03 bytes/ms)
  • BufferedFileStreamPerByte (136072.94 bytes/ms)
  • FileInputStreamArrayBatchRead (121817.52-1022494.89 bytes/ms)
  • FileInputStreamByteBufferRead (118287.20-1094091.90 bytes/ms)
  • FileInputStreamDirectByteBufferRead (130701.87-956937.80 bytes/ms)
  • FileInputStreamReadPerByte (1155.47 bytes/ms)
  • RandomAccessFileArrayBatchRead (120670.93-786782.06 bytes/ms)
  • RandomAccessFileReadPerByte (1171.73 bytes/ms)

Where there is a range in the numbers, it varies based on the size of the buffer being used. A larger buffer results in more speed up to a point, typically somewhere around the size of the caches within the hardware and operating system.

As you can see, reading bytes individually from an unbuffered stream is always slow. Batching the reads into chunks is easily the way to go. It can be the difference between roughly 1k bytes per ms and 136k bytes per ms (or more).
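To get a feel for the per-byte versus batched difference on your own machine, something along these lines works. This is not the code that produced the numbers above, just a simplified sketch: the class name, the 4 MiB temp file, and the 64 KiB array size are assumptions, and it does a single timed pass with no warm-up.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadTiming {
    // Reads the stream one byte at a time; returns the number of bytes read.
    static long readPerByte(InputStream in) throws IOException {
        long total = 0;
        try (InputStream i = in) {
            while (i.read() != -1) {
                total++;
            }
        }
        return total;
    }

    // Reads the stream in chunks of bufferSize bytes; returns the number of bytes read.
    static long readBatched(InputStream in, int bufferSize) throws IOException {
        long total = 0;
        byte[] buf = new byte[bufferSize];
        try (InputStream i = in) {
            int n;
            while ((n = i.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    static void report(String label, long bytes, long nanos) {
        System.out.printf("%-30s %.2f bytes/ms%n", label, bytes / (nanos / 1_000_000.0));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-timing", ".bin");
        try {
            Files.write(file, new byte[4 * 1024 * 1024]); // 4 MiB of zeros, an arbitrary size

            long start = System.nanoTime();
            long n = readPerByte(new FileInputStream(file.toFile()));
            report("FileInputStream per byte", n, System.nanoTime() - start);

            start = System.nanoTime();
            n = readPerByte(new BufferedInputStream(new FileInputStream(file.toFile())));
            report("BufferedInputStream per byte", n, System.nanoTime() - start);

            start = System.nanoTime();
            n = readBatched(new FileInputStream(file.toFile()), 64 * 1024);
            report("FileInputStream array batch", n, System.nanoTime() - start);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The file was just written, so it will most likely be served from the OS cache; treat the output as a relative comparison between the three approaches rather than a measure of disk speed.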

These numbers are a little old, and they will vary wildly by setup, but they will give you an idea. The code for generating the numbers can be found here; edit Main.java to select the tests that you want to run.

An excellent (and more rigorous) framework for writing benchmarks is JMH. A tutorial for learning how to use JMH can be found here.

Chris K answered Oct 03 '22 00:10