
UTF8 Encoding not adding byte order mark

We know that the constructor of the class UTF8Encoding can receive an optional parameter: a bool specifying if the encoder should provide a byte order mark (BOM) or not.

However, when encoding the same text using both approaches, the output is the same:

string text = "Hello, world!";
byte[] withBom = new UTF8Encoding(true).GetBytes(text);
byte[] withoutBom = new UTF8Encoding(false).GetBytes(text);

Both withBom and withoutBom have the same content; neither is even one byte longer than the other.

Why does this happen? Why is the byte order mark not being added to withBom?

Matias Cicero asked Feb 10 '26 22:02


1 Answer

The BOM parameter in the constructor does not affect the result of GetBytes; it affects the result of GetPreamble. Callers are expected to prepend the preamble to the encoded bytes themselves.

byte[] bom = new UTF8Encoding(true).GetPreamble(); // 3 bytes
byte[] noBom = new UTF8Encoding(false).GetPreamble(); // 0 bytes
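
To make the "prepend it yourself" step concrete, here is a minimal sketch. The class BomDemo and the helper GetBytesWithPreamble are hypothetical names introduced for illustration; only GetPreamble and GetBytes come from the framework.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper (not part of the framework): returns the
    // encoding's preamble followed by the encoded text.
    public static byte[] GetBytesWithPreamble(Encoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble(); // empty array when no BOM is configured
        byte[] body = encoding.GetBytes(text);    // never contains the BOM
        return preamble.Concat(body).ToArray();
    }

    public static void Main()
    {
        string text = "Hello, world!";
        byte[] withBom = GetBytesWithPreamble(new UTF8Encoding(true), text);
        byte[] withoutBom = GetBytesWithPreamble(new UTF8Encoding(false), text);

        Console.WriteLine(withBom.Length);    // 16: 3 BOM bytes (EF BB BF) + 13 text bytes
        Console.WriteLine(withoutBom.Length); // 13: text bytes only
    }
}
```

If the bytes are going to a file anyway, APIs that take an Encoding and own the whole output, such as File.WriteAllText, should emit the preamble for you, so manual concatenation is mainly needed when you build the byte array yourself.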
Athari answered Feb 13 '26 16:02


