Why do we use the flush parameter with the Encoder.GetBytes method?

This link explains the Encoder.GetBytes method, including a bool parameter called flush. The explanation of flush is:

true if this encoder can flush its state at the end of the conversion; otherwise, false. To ensure correct termination of a sequence of blocks of encoded bytes, the last call to GetBytes can specify a value of true for flush.

But I don't understand what flush does; maybe I am drunk or something :). Can you explain it in more detail, please?

Mohamad Alhamoud asked Oct 04 '10 18:10



1 Answer

Suppose you receive data over a socket connection. You will receive a long text as several byte[] blocks.

It is possible that 1 Unicode character occupies 2 or more bytes in a UTF-8 stream and that it is split over 2 byte blocks. Decoding the 2 blocks separately (and concatenating the resulting strings) would then produce an error or corrupted characters.

So you should specify flush=true only on the last block, and flush=false on every block before it. And of course, if you only have 1 block then that is also the last.
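To make the split-character case concrete, here is a minimal sketch. The two hard-coded byte arrays stand in for blocks received from a socket, and the class and helper names (SplitDecodeDemo, Append) are made up for illustration.

```csharp
using System;
using System.Text;

class SplitDecodeDemo
{
    static void Main()
    {
        // The euro sign (U+20AC) is 3 bytes in UTF-8: E2 82 AC.
        // Simulate two received blocks that split it after its first byte.
        byte[] block1 = { 0x41, 0xE2 };        // "A" + first byte of the euro sign
        byte[] block2 = { 0x82, 0xAC, 0x42 };  // rest of the euro sign + "B"

        // Naive approach: decode each block on its own, then concatenate.
        string naive = Encoding.UTF8.GetString(block1) + Encoding.UTF8.GetString(block2);

        // Stateful approach: one Decoder carries the dangling byte into the next call.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        Append(decoder, block1, false, sb); // more blocks follow: do not flush
        Append(decoder, block2, true, sb);  // last block: flush

        Console.WriteLine(naive == "A\u20ACB");         // False
        Console.WriteLine(sb.ToString() == "A\u20ACB"); // True
    }

    // Hypothetical helper: decodes one block and appends the resulting chars.
    static void Append(Decoder decoder, byte[] block, bool flush, StringBuilder sb)
    {
        char[] chars = new char[decoder.GetCharCount(block, 0, block.Length, flush)];
        int n = decoder.GetChars(block, 0, block.Length, chars, 0, flush);
        sb.Append(chars, 0, n);
    }
}
```

The naive version cannot produce the euro sign because neither block contains the complete byte sequence on its own.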

Tip: Use a TextReader (for example a StreamReader) and let it handle this problem for you.
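A rough sketch of that tip, assuming a StreamReader wrapped around the incoming stream. OneByteStream is a made-up helper that hands out at most one byte per Read call, to imitate a network stream that splits characters across reads.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for a network stream: returns at most 1 byte per Read,
// so every multi-byte character is guaranteed to be split across reads.
class OneByteStream : MemoryStream
{
    public OneByteStream(byte[] data) : base(data) { }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return base.Read(buffer, offset, Math.Min(count, 1));
    }
}

class ReaderDemo
{
    static void Main()
    {
        string original = "A\u20ACB"; // contains the 3-byte euro sign
        byte[] data = Encoding.UTF8.GetBytes(original);

        using (var stream = new OneByteStream(data))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            // The reader keeps its own decoder state between reads of the stream.
            string text = reader.ReadToEnd();

            // Compare what was read against the original text.
            Console.WriteLine(text == original);
        }
    }
}
```

The point of the design is that the calling code never sees the byte blocks at all, so there is no flush decision left to get wrong.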

Edit

The mirror problem (the one that was actually asked about: GetBytes) is slightly harder to explain.

Using flush=true has much the same effect as calling Encoder.Reset() after GetBytes(...). It clears the 'state' of the encoder, including any trailing characters held over from the end of the previous data block, such as an unmatched high surrogate.
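The encoder side can be sketched the same way. Here a surrogate pair is split across two char blocks. The class and helper names (SplitEncodeDemo, Encode) are invented for this example.

```csharp
using System;
using System.Text;

class SplitEncodeDemo
{
    static void Main()
    {
        // U+1F600 is one code point but two UTF-16 chars (a surrogate pair).
        char[] block1 = { 'A', '\uD83D' };  // ends with an unmatched high surrogate
        char[] block2 = { '\uDE00', 'B' };  // starts with the matching low surrogate

        Encoder encoder = Encoding.UTF8.GetEncoder();
        byte[] part1 = Encode(encoder, block1, false); // more blocks follow: keep state
        byte[] part2 = Encode(encoder, block2, true);  // last block: flush

        // Only "A" is emitted for block1; the high surrogate waits in the encoder.
        Console.WriteLine(part1.Length);                 // 1
        // The pair is completed by block2 and encoded as one 4-byte sequence, then "B".
        Console.WriteLine(BitConverter.ToString(part2)); // F0-9F-98-80-42
    }

    // Hypothetical helper: encodes one block of chars with the given flush setting.
    static byte[] Encode(Encoder encoder, char[] block, bool flush)
    {
        byte[] bytes = new byte[encoder.GetByteCount(block, 0, block.Length, flush)];
        encoder.GetBytes(block, 0, block.Length, bytes, 0, flush);
        return bytes;
    }
}
```

Passing flush=true for block1 instead would end the sequence there, so the high surrogate could not be paired with the low surrogate that arrives in block2.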

The basic idea is the same: when converting from string to blocks of bytes, or vice versa, the blocks are not independent.

Henk Holterman answered Sep 30 '22 11:09
