
Force no BOM when saving XML

I'm saving an XML file that is used by an external tool.
Unfortunately, the tool doesn't understand the encoding BOM (byte order mark: EF BB BF) at the beginning of the file:

<?xml version="1.0" encoding="UTF-8"?>
...

00000000h: EF BB BF 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E   <?xml version
00000010h: 3D 22 31 2E 30 22 20 65 6E 63 6F 64 69 6E 67 3D   ="1.0" encoding=
00000020h: 22 55 54 46 2D 38 22 3F 3E ...                    "UTF-8"?>

My code:

XmlDocument doc = new XmlDocument();
// Stuff...
using (TextWriter sw = new StreamWriter(file, false, Encoding.UTF8)) {
    doc.Save(sw);
}

Question:
How do I force the TextWriter to NOT write the BOM?

asked Jun 12 '14 13:06 by joe


1 Answer

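Encoding.UTF8 is configured to emit the BOM. Instead of using it, you can create a UTF8Encoding instance which doesn't, by passing false as the constructor argument (encoderShouldEmitUTF8Identifier):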

using (TextWriter sw = new StreamWriter(file, false, new UTF8Encoding(false))) {
    doc.Save(sw);
}

You can save this in a static field if you're worried about the cost of instantiating it repeatedly:

private static readonly Encoding UTF8NoByteOrderMark = new UTF8Encoding(false);

...

using (TextWriter sw = new StreamWriter(file, false, UTF8NoByteOrderMark)) {
    doc.Save(sw);
}
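
If you want to confirm the result, here is a minimal sketch that saves a tiny document with each encoding and checks the first three bytes of the file. The class name NoBomCheck, the helper SavedFileStartsWithBom, the temp file, and the <root/> document are my own placeholders for illustration, not part of the original code:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class NoBomCheck
{
    // UTF-8 encoding that does not emit the EF BB BF preamble.
    private static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

    // Hypothetical helper: saves a small document using the given encoding
    // and reports whether the file on disk starts with the UTF-8 BOM.
    public static bool SavedFileStartsWithBom(Encoding encoding)
    {
        string path = Path.GetTempFileName(); // scratch file, deleted below
        try
        {
            var doc = new XmlDocument();
            doc.LoadXml("<root/>"); // placeholder content

            using (TextWriter sw = new StreamWriter(path, false, encoding))
            {
                doc.Save(sw);
            }

            byte[] bytes = File.ReadAllBytes(path);
            return bytes.Length >= 3
                && bytes[0] == 0xEF
                && bytes[1] == 0xBB
                && bytes[2] == 0xBF;
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(SavedFileStartsWithBom(Encoding.UTF8));
        Console.WriteLine(SavedFileStartsWithBom(Utf8NoBom));
    }
}
```

This should print True for Encoding.UTF8 and False for the BOM-less instance, matching the hex dump in the question.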

answered Sep 21 '22 05:09 by Jon Skeet