
C#: What is the difference between Text.Encoder and Text.Encoding?

I am currently working with Unicode text as bytes, using the Encoding class to convert strings to bytes and bytes back to strings.

However, I saw that there is also an Encoder class, and it seems to do the same thing as the Encoding class. Does anyone know what the difference between them is and when to use each?

Here are the Microsoft documentation pages:

Encoder: https://msdn.microsoft.com/en-us/library/system.text.encoder(v=vs.110).aspx

Encoding: https://msdn.microsoft.com/en-us/library/system.text.encoding(v=vs.110).aspx

Eds asked Jun 12 '17 21:06



1 Answer

There is definitely a difference. An Encoding is an algorithm for transforming a sequence of characters into bytes and vice versa. An Encoder is a stateful object that transforms sequences of characters into bytes. To get an Encoder object you usually call GetEncoder on an Encoding object.

Why is it necessary to have a stateful transformation? Imagine you are trying to efficiently encode long sequences of characters. You want to avoid creating a lot of arrays or one huge array, so you break the characters down into, say, reusable 1K character buffers. However, this can produce illegal character sequences at the buffer boundaries, for example a UTF-16 surrogate pair split across two separate calls to GetBytes. The Encoder object knows how to handle this and saves the necessary state across successive calls to GetBytes.

Thus you use an Encoder for transforming one self-contained block of text. I believe you can reuse an Encoder instance to transform multiple sections of text, as long as you have called GetBytes with flush equal to true on the last array of characters of each section.

If you just want to easily encode short strings, use the Encoding.GetBytes methods. For decoding operations there is a similar Decoder class that holds the decoding state.
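To make the surrogate-pair problem concrete, here is a minimal sketch rather than a definitive implementation. It assumes UTF-8 output, top-level statements (C# 9 or later), and a deliberately tiny, hypothetical chunk size of 3 characters so that the chunk boundary falls in the middle of a surrogate pair. The sample text and variable names are illustrative only.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

// Sample text with one surrogate pair (U+1F600) in the middle.
// As UTF-16 it is 6 code units: 'a', 'b', high surrogate, low surrogate, 'c', 'd'.
string text = "ab\U0001F600cd";
char[] chars = text.ToCharArray();

// Hypothetical, unrealistically small chunk size so the pair is split.
int chunkSize = 3;

// Reference result: encode the whole string in one call.
byte[] whole = Encoding.UTF8.GetBytes(text);

// Stateless approach: call Encoding.GetBytes on each chunk independently.
// Each call sees only half of the surrogate pair.
var naive = new MemoryStream();
for (int i = 0; i < chars.Length; i += chunkSize)
{
    int count = Math.Min(chunkSize, chars.Length - i);
    byte[] part = Encoding.UTF8.GetBytes(chars, i, count);
    naive.Write(part, 0, part.Length);
}

// Stateful approach: one Encoder carries a trailing high surrogate
// over to the next GetBytes call.
Encoder encoder = Encoding.UTF8.GetEncoder();
byte[] buffer = new byte[Encoding.UTF8.GetMaxByteCount(chunkSize)];
var stateful = new MemoryStream();
for (int i = 0; i < chars.Length; i += chunkSize)
{
    int count = Math.Min(chunkSize, chars.Length - i);
    bool flush = i + count >= chars.Length; // flush only on the last chunk
    int written = encoder.GetBytes(chars, i, count, buffer, 0, flush);
    stateful.Write(buffer, 0, written);
}

Console.WriteLine("whole:    " + BitConverter.ToString(whole));
Console.WriteLine("naive:    " + BitConverter.ToString(naive.ToArray()));
Console.WriteLine("stateful: " + BitConverter.ToString(stateful.ToArray()));
Console.WriteLine("stateful matches whole: " + whole.SequenceEqual(stateful.ToArray()));
Console.WriteLine("naive matches whole:    " + whole.SequenceEqual(naive.ToArray()));
```

With the default Encoding.UTF8 instance, the chunked Encoder output should match the single-call result, while the per-chunk Encoding.GetBytes calls should replace each orphaned surrogate half with the replacement character. The buffer is sized with GetMaxByteCount so that it can also hold the bytes for a character carried over from the previous call.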

Mike Zboray answered Sep 19 '22 23:09
