Why use buffers to read/write Streams

After reading various questions about reading and writing Streams, I noticed that nearly all the answers give something like this as the correct way to do it:

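private void CopyStream(Stream input, Stream output)
{
   // Reusable 16 KB buffer: memory use stays fixed however large the stream is.
   byte[] buffer = new byte[16 * 1024];
   int read;
   // Read returns the number of bytes actually read, or 0 at the end of the stream.
   while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
   {
      // Write only the bytes that were read, not the whole buffer.
      output.Write(buffer, 0, read);
   }
}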

Two questions:

Why read and write in these smaller chunks?

What is the significance of the buffer size used?

asked May 12 '10 11:05

James Hay

1 Answer

If you read a byte at a time, then every byte you read incurs the overhead of a function call, plus additional overheads (for example, doing a fileposition += 1 to remember where you are in the file, checking whether you have reached the end of the file, and so on).

If you read 4000 bytes in one call, you pay those same overheads only once (in the above example: one function call, one add (fileposition += 4000), and one check to see whether you are at the end of the file). So in terms of these overheads, you've cut the per-byte cost by a factor of 4000. (In reality there are other costs, so you won't see that big a gain, but you have drastically cut the overheads.)
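To make the call-count argument concrete, here is a minimal sketch that counts how many Read calls it takes to drain a stream with different buffer sizes. CountingStream and ReadCallDemo are hypothetical helper names invented for this illustration, not framework types, and a MemoryStream stands in for a real file.

```csharp
using System;
using System.IO;

// Hypothetical wrapper (not a framework type) that counts Read calls made on an inner stream.
public sealed class CountingStream : Stream
{
    private readonly Stream _inner;
    public int ReadCalls { get; private set; }

    public CountingStream(Stream inner) { _inner = inner; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        ReadCalls++; // one unit of per-call overhead
        return _inner.Read(buffer, offset, count);
    }

    // Remaining members are required by the abstract Stream class; this sketch is read-only.
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class ReadCallDemo
{
    // Returns how many Read calls are needed to drain `size` bytes with the given buffer size.
    public static int CountReadCalls(int size, int bufferSize)
    {
        var source = new CountingStream(new MemoryStream(new byte[size]));
        var buffer = new byte[bufferSize];
        while (source.Read(buffer, 0, buffer.Length) > 0) { }
        return source.ReadCalls;
    }
}
```

With 8000 bytes of data, CountReadCalls(8000, 1) returns 8001 and CountReadCalls(8000, 4000) returns 3; in both cases the final call is the one that returns 0 to signal the end of the stream.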

Of course, you could create a buffer as big as the entire file, and get the absolute minimum overheads. However:

The file might be huge - bigger than the memory available to your program, so this would simply fail. Or it might be big enough that the system starts paging to virtual memory, which will drastically slow things down. Breaking the data into smaller chunks means you can copy an unlimited amount of data using a small fixed-size buffer.
Your OS and devices might be able to read and write data simultaneously (e.g. when copying from one physical disk drive to another). If you read all the data before you write any of it, you have to wait for the whole read to finish before you can start writing. In many cases you can do both operations in parallel: read a small chunk, start writing it "asynchronously" (in the background), and go back to read the next chunk while that write is in progress.
You get diminishing returns. Reading 4 bytes at a time instead of 1 may well be 4x faster, but going from 4,000 to 40,000 or 400,000 bytes will not speed things up much further (indeed, for the reasons above, larger buffers could actually slow things down).
In some cases, physical devices work with specific data sizes (e.g. 4096 bytes per sector, 64 or 128 bytes per cache line, 1500 bytes per data packet, or 8 bytes (64 bits) over a CPU bus). Dividing data up into chunks that match (or are multiples of) the underlying transport/storage mechanism can help the hardware to process the data more efficiently.
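The "read the next chunk while the previous one is being written" idea from the list above can be sketched with two buffers and the async Stream methods. This is only an illustration under stated assumptions: OverlappedCopy and CopyAsync are made-up names, input and output are assumed to be two different streams, and whether the read and write really overlap depends on the streams and devices involved.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class OverlappedCopy
{
    // Copies input to output using two buffers, so that the write of one chunk
    // can be in flight while the next chunk is being read. Returns bytes copied.
    public static async Task<long> CopyAsync(Stream input, Stream output, int bufferSize = 16 * 1024)
    {
        byte[] current = new byte[bufferSize];
        byte[] next = new byte[bufferSize];
        long total = 0;

        int read = await input.ReadAsync(current, 0, current.Length);
        while (read > 0)
        {
            // Start writing the chunk we already have...
            Task write = output.WriteAsync(current, 0, read);
            // ...and read the following chunk into the other buffer in the meantime.
            int nextRead = await input.ReadAsync(next, 0, next.Length);
            // Wait for the write to finish before its buffer is reused.
            await write;

            total += read;

            // Swap buffers: the chunk just read becomes the one to write.
            byte[] tmp = current;
            current = next;
            next = tmp;
            read = nextRead;
        }
        return total;
    }
}
```

Memory use stays fixed at two buffers regardless of the size of the data, which is the same property the simple loop in the question has with one buffer.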

Typically, I/O buffers of between 4 kB and 128 kB work best for most situations. You can tune the size to the particular operation being performed, but there is no "perfect" size that fits all situations.
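One way to tune the size for a particular operation is simply to measure it. The sketch below times a file copy for a given buffer size using Stream.CopyTo(Stream, int) and Stopwatch; BufferSizeTiming and TimeCopy are hypothetical names, and the destination is a temporary file chosen purely for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class BufferSizeTiming
{
    // Copies sourcePath to a temporary file with the given buffer size
    // and returns how long the copy took. The temporary file is deleted afterwards.
    public static TimeSpan TimeCopy(string sourcePath, int bufferSize)
    {
        string destPath = Path.GetTempFileName();
        try
        {
            var timer = Stopwatch.StartNew();
            using (var input = File.OpenRead(sourcePath))
            using (var output = File.Create(destPath))
            {
                input.CopyTo(output, bufferSize);
            }
            timer.Stop();
            return timer.Elapsed;
        }
        finally
        {
            File.Delete(destPath);
        }
    }
}
```

Calling TimeCopy in a loop over sizes such as 4 * 1024, 16 * 1024, 64 * 1024 and 128 * 1024 gives a rough comparison for your own files and hardware. Treat the numbers with caution: the OS file cache usually makes second and later runs over the same file faster, so repeat each measurement several times and vary the order.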

Note that in most I/O situations, there are many buffers being used. For example, when copying data from a disk, (in simplistic terms) it is read from the disk into a read cache (buffer) in the hard drive, then sent over the interface cable to the computer's drive controller, which may also buffer the data. Then it may be transferred into RAM via an I/O buffer, where it is held until your program is ready to receive it (the OS will probably even be fetching data before you ask for it, as it expects you to continue reading from the same file and tries to buffer the data so you don't have to wait for it). Then you read it into your buffer and write it.

On the way out, the data goes to another I/O buffer, is sent to the drive controller, passed on to the drive, and cached in a write cache. Eventually the hard drive will decide to actually store the data held in its write cache, and your copy will be complete. Most of this happens in the background, so the data may not finish being written until many seconds after your program thinks it has finished writing and has gone on to another task. (This is why you have to "safely remove" USB drives before unplugging them: the OS may not have actually written all the data to the device yet, even many seconds after the computer said your copy operation was finished.)
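If your program needs the data to be pushed past these intermediate buffers before it carries on, .NET lets you ask for that explicitly with FileStream.Flush(true). The sketch below is a minimal illustration; DurableWrite and WriteAndFlushToDisk are made-up names, and whether the drive's own write cache is also flushed depends on the OS and hardware.

```csharp
using System;
using System.IO;

public static class DurableWrite
{
    // Writes data to path and asks for all intermediate buffers to be flushed before returning.
    public static void WriteAndFlushToDisk(string path, byte[] data)
    {
        using (var output = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            output.Write(data, 0, data.Length);
            // Flush(true) clears the FileStream's own buffer and also asks the OS
            // to flush its intermediate file buffers for this file.
            output.Flush(true);
        }
    }
}
```

This trades speed for certainty: forcing a flush after every write gives up much of the benefit of the background buffering described above, so it is normally reserved for data that must survive a crash or an unplugged device.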

answered Sep 23 '22 18:09

Jason Williams

Tags: language-agnostic, .net, stream