Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When to use StreamReader.ReadBlock()?

Tags:

c#

readblock

I would like to know of a situation Read(char[],int,int) fails to return all chars requested while ReadBlock() returns all chars as expected (say when StreamReader works with an instance of a FileStream object).

like image 1000
RanC Avatar asked Sep 28 '10 07:09

RanC


1 Answers

In practice, when using a StreamReader it is only likely to happen with a stream which may delay for some time - e.g. a network stream as people have mentioned here.

However, with a TextReader generally, you can expect it to happen at any time (including possibly with a future version of .NET in a case where it doesn't currently happen - that it doesn't happen with StreamReader backed on FileStream isn't documented, so there's no guarantee it won't happen in the future).

In particular there are a lot of cases where it's easier (with a knock on effect of simpler, more reliable and probably more efficient code) for the implementer to not return the requested amount if they can partially fulfil the call with just emptying the current buffer, or by using the amount requested as the amount to pass to a backing source (a stream or another TextReader) before doing an operation that could only return a lower number of characters.

Now, to answer the actual question as to "When to use StreamReader.ReadBlock()?" (or more generally, when to use TextReader.ReadBlock()). The following should be borne in mind:

  1. Both Read() and ReadBlock() are guaranteed to return at least one character unless the entire source has been read. Neither will return 0 if there is pending content.

  2. Calling ReadBlock() when Read() will do is wasteful, as it loops needlessly.

  3. But on the other hand it's not that wasteful.

  4. But on the third hand, the cases where Read() will return fewer than requested characters are often cases where another thread is engaged in getting the content that will fill the buffer for the next call, or where that content doesn't exist yet (e.g. user input or a pending operation on another machine) - there's better overall concurrency to be found in processing the partial result and then calling Read() again when that's finished.

So. If you can do something useful with a partial result, then call Read() and work on what you get. In particular if you are looping through and working on the result of each Read() then do this rather than with ReadBlock().

A notable case is if you are building your own TextReader that is backed by another. There's no point calling ReadBlock() unless the algorithm really needs a certain number of characters to work - just return as much as you can from a call to Read() and let the calling code call ReadBlock() if it needs to.

In particular, note that the following code:

char buffer = char[4096];
int len = 0;
while((len = tr.ReadBlock(buffer, 0 , 4096)) != 0)
  DoSomething(buffer, 0, len);

Can be rewritten as:

char buffer = char[4096];
for(int len = tr.Read(buffer, 0, 4096); len != 0; len = tr.Read(buffer, 0, 4096))
  DoSomething(buffer, 0, len);

It may call DoSomething() with smaller sizes sometimes, but it can also have better concurrency if there's another thread involved in providing the data for the next call to Read().

However, the gain is not major in most cases. If you really need a certain number of characters, then do call ReadBlock(). Most importantly in those cases where Read() will have the same result as ReadBlock() the overhead of ReadBlock() checking that it has done so is very slight. Don't try to second-guess whether Read() is safe or not in a given case; if it needs the guarantee of ReadBlock() then use ReadBlock().

like image 140
Jon Hanna Avatar answered Sep 17 '22 23:09

Jon Hanna