 

Using Stream.Read() vs BinaryReader.Read() to process binary streams


When working with binary streams (i.e. byte[] arrays), the main point of using BinaryReader or BinaryWriter seems to be simplified reading/writing of primitive data types from a stream, using methods such as ReadBoolean() and taking encoding into account. Is that the whole story? Is there an inherent advantage or disadvantage if one works directly with a Stream, without using BinaryReader/BinaryWriter? Most methods, such as Read(), seem to be the same in both classes, and my guess is that they work identically underneath.
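To make the convenience difference concrete, here is a minimal sketch (the file name, layout, and helper names are made up for illustration): reading an Int32 from a raw Stream means assembling it from bytes yourself, while BinaryReader does the conversion in one call. Note that BinaryWriter always writes little-endian, while BitConverter uses the machine's native byte order, so the manual variant assumes a little-endian machine.

```csharp
using System;
using System.IO;

static class Demo
{
    // Write a sample file: one bool followed by one Int32 (little-endian).
    public static void WriteSample(string path)
    {
        using (var w = new BinaryWriter(File.Create(path)))
        {
            w.Write(true);   // 1 byte
            w.Write(12345);  // 4 bytes
        }
    }

    // With BinaryReader: each primitive is one call.
    public static (bool flag, int number) ReadWithReader(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
            return (reader.ReadBoolean(), reader.ReadInt32());
    }

    // With a raw Stream: you assemble the primitives from bytes yourself.
    public static (bool flag, int number) ReadWithStream(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            bool flag = stream.ReadByte() != 0;
            byte[] four = new byte[4];
            stream.Read(four, 0, four.Length); // enough bytes are available here
            return (flag, BitConverter.ToInt32(four, 0)); // native byte order
        }
    }

    static void Main()
    {
        WriteSample("demo.dat");
        Console.WriteLine(ReadWithReader("demo.dat"));
        Console.WriteLine(ReadWithStream("demo.dat"));
    }
}
```

Both paths produce the same values; the Reader just hides the byte-level assembly.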

Consider a simple example of processing a binary file in two different ways (edit: I realize this approach is inefficient and that a buffer could be used; it's just a sample):

    // Using FileStream directly
    using (FileStream stream = new FileStream("file.dat", FileMode.Open))
    {
        // Read bytes from stream and interpret them as ints
        int value;
        while ((value = stream.ReadByte()) != -1)
        {
            Console.WriteLine(value);
        }
    }

    // Using BinaryReader
    using (BinaryReader reader = new BinaryReader(new FileStream("file.dat", FileMode.Open)))
    {
        // Read bytes and interpret them as ints
        while (reader.BaseStream.Position < reader.BaseStream.Length)
        {
            byte value = reader.ReadByte();
            Console.WriteLine(Convert.ToInt32(value));
        }
    }

The output will be the same, but what's happening internally (e.g. from OS perspective)? Is it - generally speaking - important which implementation is used? Is there any purpose to using BinaryReader/BinaryWriter if you don't need the extra methods that they provide? For this specific case, MSDN says this in regard to Stream.ReadByte():

The default implementation on Stream creates a new single-byte array and then calls Read. While this is formally correct, it is inefficient.

Using GC.GetTotalMemory(), this first approach does seem to allocate 2x as much space as the second one, but AFAIK this shouldn't be the case if a more general Stream.Read() method is used (e.g. for reading in chunks using a buffer). Still, it seems to me that these methods/interfaces could be unified easily...
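The MSDN remark can be illustrated with a sketch. The override below mimics what the base Stream.ReadByte does (simplified here, not the actual framework source) and counts the per-call allocations; note that FileStream itself overrides ReadByte with a buffered version, so this models the default base implementation only.

```csharp
using System;
using System.IO;

// Sketch of why the default Stream.ReadByte can allocate: the base
// implementation (simplified, not the real framework source) creates a
// fresh one-byte array per call and forwards to Read().
class CountingStream : MemoryStream
{
    public int Allocations;

    public override int ReadByte()
    {
        byte[] oneByte = new byte[1]; // fresh allocation on every call
        Allocations++;
        return Read(oneByte, 0, 1) == 0 ? -1 : oneByte[0];
    }
}

class Program
{
    static void Main()
    {
        var s = new CountingStream();
        s.Write(new byte[] { 1, 2, 3 }, 0, 3);
        s.Position = 0;
        while (s.ReadByte() != -1) { }
        // One allocation per byte read, plus the final call that returns -1.
        Console.WriteLine(s.Allocations); // 4
    }
}
```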

w128 asked Jun 11 '13




2 Answers

No, there is no principal difference between the two approaches. The Reader adds a layer of buffering, so you shouldn't mix direct Stream access with Reader access on the same stream. But don't expect any significant performance difference; it's all dominated by the actual I/O.

So,

  • use a Stream when you have (only) byte[] data to move, as is common in many streaming scenarios.
  • use BinaryWriter and BinaryReader when you have any other primitive type of data (including single bytes) to process. Their main purpose is converting the built-in framework types to and from byte[].
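The second bullet can be sketched as a round trip through a MemoryStream (the values here are arbitrary): BinaryWriter serializes each framework type to its byte representation, including a length-prefixed, encoded string, and BinaryReader reverses the conversion.

```csharp
using System;
using System.IO;
using System.Text;

static class Sketch
{
    public static (bool b, double d, string s) RoundTrip()
    {
        var ms = new MemoryStream();

        // leaveOpen keeps the MemoryStream usable after the writer is disposed.
        using (var writer = new BinaryWriter(ms, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(true);    // 1 byte
            writer.Write(3.14);    // 8 bytes (double)
            writer.Write("hello"); // length-prefixed UTF-8 string
        }

        ms.Position = 0;
        using (var reader = new BinaryReader(ms))
            return (reader.ReadBoolean(), reader.ReadDouble(), reader.ReadString());
    }

    static void Main()
    {
        var result = RoundTrip();
        Console.WriteLine(result); // (True, 3.14, hello)
    }
}
```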
Henk Holterman answered Nov 14 '22


One big difference is how you can buffer the I/O. If you are writing/reading only a few bytes here and there, BinaryWriter/BinaryReader work well. But if you have to read MBs of data, then reading one byte, Int32, etc. at a time will be a bit slow. You could instead read larger chunks and parse from there.

Example:

    // Using FileStream directly with a buffer
    using (FileStream stream = new FileStream("file.dat", FileMode.Open))
    {
        // Read bytes from stream and interpret them as ints
        byte[] buffer = new byte[1024];
        int count;
        // Read from the I/O stream fewer times.
        while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
            for (int i = 0; i < count; i++)
                Console.WriteLine(Convert.ToInt32(buffer[i]));
    }

Now this is a bit off topic, but I'll throw it out there: if you want to get very crafty and give yourself a real performance boost (albeit one that might be considered dangerous), then instead of parsing each Int32 individually, you can convert them all at once using Buffer.BlockCopy().

Another example:

    // Using FileStream directly with a buffer and BlockCopy
    using (FileStream stream = new FileStream("file.dat", FileMode.Open))
    {
        // Read bytes from stream and interpret them as ints
        byte[] buffer = new byte[1024];
        int[] intArray = new int[buffer.Length >> 2]; // Each int is 4 bytes
        int count;
        // Read from the I/O stream fewer times.
        while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Copy the bytes into the memory space of the Int32 array in one big swoop
            Buffer.BlockCopy(buffer, 0, intArray, 0, count);

            for (int i = 0; i < count / 4; i++)
                Console.WriteLine(intArray[i]);
        }
    }

A few things to note about this example: it consumes four bytes per Int32 instead of one, so it will yield different results than the byte-at-a-time samples. You can also do this for data types other than Int32, but many would argue that marshalling should be on your mind then. (I just wanted to present something to think about.)
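For comparison, here is a sketch of a per-element alternative to BlockCopy (the helper name and sample bytes are made up): BitConverter.ToInt32 converts each 4-byte group explicitly. Like BlockCopy, BitConverter uses the machine's native byte order, so the expected values below assume a little-endian machine; data written with different endianness would need byte swapping.

```csharp
using System;

static class Converted
{
    // Convert each 4-byte group in buffer[0..count) into an Int32
    // using the machine's native byte order.
    public static int[] ParseInts(byte[] buffer, int count)
    {
        int[] result = new int[count / 4];
        for (int i = 0; i + 4 <= count; i += 4)
            result[i / 4] = BitConverter.ToInt32(buffer, i);
        return result;
    }

    static void Main()
    {
        // Hypothetical input: three Int32 values as raw little-endian bytes.
        byte[] buffer = { 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0 };

        foreach (int v in ParseInts(buffer, buffer.Length))
            Console.WriteLine(v); // 1, 2, 3 on a little-endian machine
    }
}
```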

poy answered Nov 14 '22