Why is the performance of BufferedReader so much worse than BufferedInputStream?

I understand that using a BufferedReader (wrapping a FileReader) is going to be significantly slower than using a BufferedInputStream (wrapping a FileInputStream), because the raw bytes have to be converted to characters. But I don't understand why it is so much slower! Here are the two code samples that I'm using:

BufferedInputStream inputStream = new BufferedInputStream(new FileInputStream(filename));
try {
  byte[] byteBuffer = new byte[bufferSize];
  int numberOfBytes;
  do {
    numberOfBytes = inputStream.read(byteBuffer, 0, bufferSize);
  } while (numberOfBytes >= 0);
}
finally {
  inputStream.close();
}

and:

BufferedReader reader = new BufferedReader(new FileReader(filename), bufferSize);
try {
  char[] charBuffer = new char[bufferSize];
  int numberOfChars;
  do {
    numberOfChars = reader.read(charBuffer, 0, bufferSize);
  } while (numberOfChars >= 0);
}
finally {
  reader.close();
}

I've tried tests using various buffer sizes, all with a 150 megabyte file. Here are the results (buffer size is in bytes; times are in milliseconds):

Buffer size  BufferedInputStream (ms)  BufferedReader (ms)
      4,096                       145                  497
      8,192                       125                  465
     16,384                        95                  515
     32,768                        74                  506
     65,536                        64                  531

As can be seen, the fastest time for the BufferedInputStream (64 ms) is roughly seven times faster than the fastest time for the BufferedReader (465 ms). As I stated above, I expect a significant difference, but this much seems unreasonable.

My question is: does anyone have a suggestion for how to improve the performance of the BufferedReader, or an alternative mechanism?

asked Jan 13 '13 06:01

Andy King

2 Answers

The BufferedReader has to convert the bytes into chars. This byte-by-byte parsing and copying into a larger type is expensive relative to a straight copy of blocks of data.

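import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// 150 MB of newline bytes: pure in-memory decoding, no file I/O involved
byte[] bytes = new byte[150 * 1024 * 1024];
Arrays.fill(bytes, (byte) '\n');

// Repeat the measurement so later iterations run after JIT warm-up
for (int i = 0; i < 10; i++) {
    long start = System.nanoTime();
    StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    long time = System.nanoTime() - start;
    System.out.printf("Time to decode %,d MB was %,d ms%n",
            bytes.length / 1024 / 1024, time / 1000000);
}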

prints

Time to decode 150 MB was 226 ms
Time to decode 150 MB was 167 ms

NOTE: Interleaving this decoding with system calls can slow down both operations, because system calls can disturb the cache.
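A minimal sketch of how the two costs could be separated on a real file, assuming the file's encoding is known. The class name ReadCostSketch and its helper methods are hypothetical, and ISO_8859_1 is only an assumed example of a single-byte charset. Note that FileReader in the question uses the default charset, whereas this sketch passes one explicitly through InputStreamReader. Timings depend on JIT warm-up and the OS file cache, so treat any numbers it prints as rough.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
class ReadCostSketch {

    // Reads the stream as raw bytes in bulk and returns the number of bytes seen.
    static long countBytes(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer, 0, bufferSize)) >= 0) {
            total += n;
        }
        return total;
    }

    // Reads the same data through a charset decoder and returns the number of chars produced.
    static long countChars(InputStream in, Charset charset, int bufferSize) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(in, charset), bufferSize);
        char[] buffer = new char[bufferSize];
        long total = 0;
        int n;
        while ((n = reader.read(buffer, 0, bufferSize)) >= 0) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);   // path to a test file, supplied by the caller
        int bufferSize = 8192;
        // Repeat so that the JIT and the OS file cache have a chance to warm up.
        for (int run = 0; run < 5; run++) {
            long t0 = System.nanoTime();
            long bytes;
            try (InputStream in = Files.newInputStream(file)) {
                bytes = countBytes(in, bufferSize);
            }
            long t1 = System.nanoTime();
            long chars;
            try (InputStream in = Files.newInputStream(file)) {
                // Assumption: the file is valid in this single-byte charset.
                chars = countChars(in, StandardCharsets.ISO_8859_1, bufferSize);
            }
            long t2 = System.nanoTime();
            System.out.printf("bytes=%,d in %,d ms; chars=%,d in %,d ms%n",
                    bytes, (t1 - t0) / 1_000_000, chars, (t2 - t1) / 1_000_000);
        }
    }
}
```

Running it with a file path as the only argument prints one line per run. The gap between the two timings on the later runs gives a rough idea of how much of the Reader's cost is decoding rather than I/O.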

answered Sep 20 '22 17:09

Peter Lawrey

In the BufferedReader implementation there is a fixed constant, defaultExpectedLineLength = 80, which the readLine method uses when allocating its StringBuffer. If you have a big file with many lines longer than 80 characters, this fragment is something that could be improved:

if (s == null) 
    s = new StringBuffer(defaultExpectedLineLength);
s.append(cb, startChar, i - startChar);
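If the lines do not need to be materialized as String objects, one way to sidestep the per-line buffer in readLine is to scan the chunks returned by read(char[], int, int) directly. The sketch below counts lines that way. The class ChunkedLineScan and its method are hypothetical names, and it assumes the same line terminators that readLine recognizes (\n, \r, or \r\n). Whether this is measurably faster depends on the file and the JVM, so it is worth benchmarking before adopting.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper class, for illustration only.
class ChunkedLineScan {

    // Counts lines by scanning char chunks, without building a String per line.
    static long countLines(Reader reader, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize];
        long lines = 0;
        boolean lastWasCR = false;  // previous char was '\r' (possibly in the previous chunk)
        boolean inLine = false;     // chars have been seen since the last terminator
        int n;
        while ((n = reader.read(buffer, 0, bufferSize)) >= 0) {
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (c == '\n') {
                    if (!lastWasCR) {
                        lines++;    // a "\r\n" pair was already counted at the '\r'
                    }
                    lastWasCR = false;
                    inLine = false;
                } else if (c == '\r') {
                    lines++;
                    lastWasCR = true;
                    inLine = false;
                } else {
                    lastWasCR = false;
                    inLine = true;
                }
            }
        }
        if (inLine) {
            lines++;                // final line without a terminator
        }
        return lines;
    }
}
```

A caller would pass any Reader, for example a FileReader or an InputStreamReader with an explicit charset, together with a chunk size such as 8192.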

answered Sep 18 '22 17:09

Jakub C


Tags: java, performance, bufferedreader, bufferedinputstream
