
Java - Using DataInputStream with Sockets, buffered or not?

I'm writing a simple client/server application, and I found that using DataInputStream to read data is very convenient because it lets you choose what to read (without having to convert it from bytes yourself). But I'm wondering whether it would be best to wrap it in a BufferedInputStream too, or whether that would just add unnecessary overhead.

The reason I'm asking is that I don't know how expensive it is to read directly from the socket stream (with a BufferedInputStream, the data would be read once from the socket stream and then multiple times from the BufferedInputStream via the DataInputStream).

The data received is usually pretty small, around 20-25 bytes.

Thanks in advance for any answer! :D

Anton asked Nov 05 '10 22:11


2 Answers

A DataInputStream is not buffered, so each read operation on a DataInputStream object is going to result in one or more reads on the underlying socket stream, and that could result in multiple system calls (or the equivalent).
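As a minimal sketch of the wrapping being discussed (the class name `BufferedDataIn` and its helper methods are made up for illustration, and a byte array stands in for the socket so the example runs on its own), the buffer goes between the raw stream and the DataInputStream:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of any library.
final class BufferedDataIn {
    // Wraps a raw stream (for example socket.getInputStream()) so that the
    // typed reads of DataInputStream are served from an in-memory buffer.
    static DataInputStream wrap(InputStream raw) {
        return new DataInputStream(new BufferedInputStream(raw));
    }

    // Stand-in for a socket round trip: encode two fields into a byte array,
    // then read them back through the buffered wrapper.
    static String roundTrip(int id, double value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(id);
            out.writeDouble(value);
            out.flush();
            DataInputStream in = wrap(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readInt() + ":" + in.readDouble();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(7, 2.5)); // prints 7:2.5
    }
}
```

With a real connection, the call would look like `BufferedDataIn.wrap(socket.getInputStream())`; the rest of the reading code stays the same.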

A system call is typically 2 to 3 orders of magnitude more expensive than a regular method call. Buffered streams work by reducing the number of system calls (ideally to 1), at the cost of adding an extra layer of regular method calls. Typically, using a buffered stream replaces N syscalls with 1 syscall plus N extra method calls. If N is greater than 1, you win.
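One rough way to see the "N reads versus 1 read" effect without a network is to count how many read calls actually reach the underlying stream. The sketch below is an illustration under stated assumptions: `ReadCountDemo` and `CountingInputStream` are hypothetical names, the 23-byte message layout is invented, and a ByteArrayInputStream stands in for the socket (so every counted call would be a potential syscall on a real socket, and a real socket may deliver data in several pieces).

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReadCountDemo {
    // Counts the read calls that reach the wrapped stream.
    static final class CountingInputStream extends FilterInputStream {
        int calls;
        CountingInputStream(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            calls++;
            return in.read();
        }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    // Reads one 23-byte message field by field and reports how many read
    // calls hit the underlying (counted) stream.
    static int underlyingReads(boolean buffered) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(42);
            out.writeShort(7);
            out.writeByte(1);
            out.writeLong(123456789L);
            out.writeDouble(0.5);
            out.flush();

            CountingInputStream counting =
                    new CountingInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            InputStream source = buffered ? new BufferedInputStream(counting) : counting;
            DataInputStream in = new DataInputStream(source);
            in.readInt();
            in.readShort();
            in.readByte();
            in.readLong();
            in.readDouble();
            return counting.calls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + underlyingReads(false));
        System.out.println("buffered:   " + underlyingReads(true));
    }
}
```

The buffered run should report a single underlying read, since the first fill of the buffer pulls in the whole message. The unbuffered number is larger, and its exact value depends on the JDK version, because the way DataInputStream implements calls such as readInt() has changed between releases.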

It follows that the only cases where putting a BufferedInputStream between the socket stream and the DataInputStream is not a win are:

  • when the application only makes one read...() call and that can be satisfied by a single syscall,
  • when the application only does large read(byte[] ...) calls, or
  • when the application doesn't read anything.

It sounds like these don't apply in your case.

Besides, even if they do apply, the overhead of using a BufferedInputStream when you don't need to is relatively small. The overhead of not using a BufferedInputStream when you do need to can be huge.

One final point: the actual amount of data read (i.e. the size of the messages) is pretty much irrelevant to the buffered versus unbuffered question. What really matters is the way the data is read, i.e. the sequence of read...() calls that your application will make.
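To illustrate how the read pattern matters, here is a sketch of the "large read(byte[] ...)" style mentioned in the list above. It assumes a hypothetical fixed-size 24-byte message (two ints, a long and a double); the names `BulkReadDemo`, `Message` and `readMessage` are invented for the example. The whole message is requested with one readFully() call and then decoded in memory, so the typed decoding never touches the stream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class BulkReadDemo {
    // Assumed layout: int type, int id, long timestamp, double value.
    static final int MESSAGE_SIZE = 24;

    static final class Message {
        final int type;
        final int id;
        final long timestamp;
        final double value;
        Message(int type, int id, long timestamp, double value) {
            this.type = type;
            this.id = id;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // One bulk request for the whole message, then in-memory decoding.
    static Message readMessage(DataInputStream in) throws IOException {
        byte[] frame = new byte[MESSAGE_SIZE];
        in.readFully(frame);
        // ByteBuffer is big-endian by default, matching DataOutputStream.
        ByteBuffer buf = ByteBuffer.wrap(frame);
        return new Message(buf.getInt(), buf.getInt(), buf.getLong(), buf.getDouble());
    }

    // Encodes a sample message and reads it back without a BufferedInputStream.
    static String describe() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(1);
            out.writeInt(99);
            out.writeLong(1288994400000L);
            out.writeDouble(3.5);
            out.flush();
            Message m = readMessage(
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return m.type + "/" + m.id + "/" + m.timestamp + "/" + m.value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints 1/99/1288994400000/3.5
    }
}
```

On a real socket, readFully() may still need more than one underlying read if the message arrives in pieces, so this is a reduction in read calls rather than a guarantee of exactly one.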

Stephen C answered Oct 21 '22 11:10


The general wisdom is that individual reads on the underlying stream are very slow, so buffering is almost always faster. However, for such small messages (20-25 bytes), the cost of allocating the buffer might be similar to the cost of making those individual reads (once you consider memory allocation and garbage collection). Unfortunately, the only way to find out is to test it and see.

You say that the data received is usually small: how often do you expect larger messages? Occasional large messages on an unbuffered stream could become a significant bottleneck.

I'd suggest that you run some timing tests and see if buffering makes a difference in your case. Or don't bother with timing tests and just use a buffer: if the message size changes in the future, having the buffer already in place will save maintenance later on.
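A timing test along those lines could look like the sketch below. Everything in it is an assumption made for illustration: the class name `SocketReadTiming`, the message count, and the 24-byte message layout. It sends messages over a loopback socket and reads them back field by field, with or without a BufferedInputStream. It is a crude measurement (no JIT warm-up control, loopback instead of a real network), so treat the numbers as a rough indication only.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class SocketReadTiming {
    static final int MESSAGES = 10_000; // arbitrary; tune for your machine

    // Returns {elapsedNanos, messagesRead} for one run over loopback.
    static long[] run(boolean buffered) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Sender thread: accepts the connection and writes all messages.
            Thread sender = new Thread(() -> {
                try (Socket s = server.accept();
                     DataOutputStream out = new DataOutputStream(
                             new BufferedOutputStream(s.getOutputStream()))) {
                    for (int i = 0; i < MESSAGES; i++) {
                        out.writeInt(i);
                        out.writeInt(1);
                        out.writeLong(i);
                        out.writeDouble(0.5);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort())) {
                InputStream raw = client.getInputStream();
                DataInputStream in = new DataInputStream(
                        buffered ? new BufferedInputStream(raw) : raw);
                long count = 0;
                long start = System.nanoTime();
                for (int i = 0; i < MESSAGES; i++) {
                    in.readInt();
                    in.readInt();
                    in.readLong();
                    in.readDouble();
                    count++;
                }
                long elapsed = System.nanoTime() - start;
                sender.join();
                return new long[] {elapsed, count};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run(true); // throwaway warm-up run
        System.out.println("unbuffered ms: " + run(false)[0] / 1_000_000);
        System.out.println("buffered ms:   " + run(true)[0] / 1_000_000);
    }
}
```

Running it a few times and comparing the two figures gives a first impression; for anything more rigorous, a proper benchmarking harness and your application's real message pattern would be the better basis.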

Cameron Skinner answered Oct 21 '22 09:10