This link explains the Encoder.GetBytes method, which has a bool parameter called flush. The explanation of flush is:
true if this encoder can flush its state at the end of the conversion; otherwise, false. To ensure correct termination of a sequence of blocks of encoded bytes, the last call to GetBytes can specify a value of true for flush.
But I didn't understand what flush does, maybe I am drunk or something :). Can you explain it in more detail, please?
Suppose you receive data over a socket connection. You will receive a long text as several byte[] blocks.
It is possible that 1 Unicode character occupies 2+ bytes in a UTF-8 stream and that it is split over 2 byte blocks. Decoding the 2 byte blocks separately (and concatenating the strings) would corrupt that character: each half is invalid on its own, so with the default settings you get replacement characters instead of the original one (or an exception, if the encoding is set up to throw on invalid bytes).
So you should only specify flush=true on the last block. And of course, if you only have 1 block then that is also the last.
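To make the split-character case concrete, here is a minimal sketch. The class name SplitDecodeDemo, its methods and the sample bytes are made up for illustration; only Encoding.UTF8, GetDecoder() and Decoder.GetChars(...) come from the framework. It decodes the same two blocks once independently and once through a single Decoder, passing flush=true only for the last block.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of any library.
public static class SplitDecodeDemo
{
    // U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8.
    // These blocks split that character across the block boundary.
    static readonly byte[] Block1 = { (byte)'c', (byte)'a', (byte)'f', 0xC3 };
    static readonly byte[] Block2 = { 0xA9, (byte)'!' };

    // Each GetString call starts from scratch, so the dangling C3 and the
    // orphaned A9 are both treated as invalid input.
    public static string DecodeIndependently()
    {
        return Encoding.UTF8.GetString(Block1) + Encoding.UTF8.GetString(Block2);
    }

    // One Decoder instance remembers the dangling C3 between the two calls.
    public static string DecodeWithState()
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var result = new StringBuilder();
        AppendBlock(decoder, result, Block1, false); // more data follows
        AppendBlock(decoder, result, Block2, true);  // last block: flush
        return result.ToString();
    }

    static void AppendBlock(Decoder decoder, StringBuilder result, byte[] block, bool flush)
    {
        // GetMaxCharCount leaves room for a character carried over from the previous block.
        char[] chars = new char[Encoding.UTF8.GetMaxCharCount(block.Length)];
        int count = decoder.GetChars(block, 0, block.Length, chars, 0, flush);
        result.Append(chars, 0, count);
    }
}
```

DecodeWithState() should give back the original text, while DecodeIndependently() should contain U+FFFD replacement characters where the split character used to be (assuming the default replacement fallback of Encoding.UTF8).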
Tip: Use a TextReader (for example a StreamReader) and let it handle this problem for you.
The mirror problem (the one that was actually asked about: GetBytes) is slightly harder to explain.
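Using flush=true is roughly the same as using Encoder.Reset() after GetBytes(...). It clears the 'state' of the encoder, including trailing characters at the end of the previous data block, such as an unmatched high surrogate. With flush=false the encoder holds on to such a character and waits for its other half in the next block; with flush=true nothing is carried over into the next call.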
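Here is a small sketch of the GetBytes side, under the same caveats: SplitEncodeDemo and its members are hypothetical names, and the sample text is invented. A character outside the Basic Multilingual Plane is stored as a surrogate pair, and the two char blocks below split that pair, so the value of flush on the first block decides whether the pair survives.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical demo class, not part of any library.
public static class SplitEncodeDemo
{
    // U+1F600 is stored in a .NET string as the surrogate pair D83D DE00.
    // These blocks split the pair across the block boundary.
    static readonly char[] Block1 = { 'h', 'i', '\uD83D' };
    static readonly char[] Block2 = { '\uDE00' };

    // Encodes both blocks with one Encoder. flushFirstBlock is the value of
    // flush passed for the first block; the last block is always flushed.
    public static byte[] Encode(bool flushFirstBlock)
    {
        Encoder encoder = Encoding.UTF8.GetEncoder();
        var output = new List<byte>();
        AppendBlock(encoder, output, Block1, flushFirstBlock);
        AppendBlock(encoder, output, Block2, true);
        return output.ToArray();
    }

    static void AppendBlock(Encoder encoder, List<byte> output, char[] block, bool flush)
    {
        // GetMaxByteCount leaves room for a surrogate carried over from the previous block.
        byte[] bytes = new byte[Encoding.UTF8.GetMaxByteCount(block.Length)];
        int count = encoder.GetBytes(block, 0, block.Length, bytes, 0, flush);
        for (int i = 0; i < count; i++)
        {
            output.Add(bytes[i]);
        }
    }
}
```

Encode(false) should match what Encoding.UTF8.GetBytes produces for the whole string at once, because the high surrogate is kept until its partner arrives. Encode(true) flushes too early, so the two halves are encoded separately and (with the default replacement fallback) the original character is lost.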
The basic idea is the same: when converting from string to blocks of bytes, or vice versa, the blocks are not independent.