I'm opening an existing XML file with C# and replacing some nodes in it. Everything works fine, except that after I save the file I get the following characters at the beginning:
ï»¿ (EF BB BF in hex)
The whole first line:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The rest of the file looks like a normal XML file. The simplified code is here:
    XmlDocument doc = new XmlDocument();
    doc.Load(xmlSourceFile);
    XmlNode translation = doc.SelectSingleNode("//trans-unit[@id='127']");
    translation.InnerText = "testing";
    doc.Save(xmlTranslatedFile);
I'm using a C# Windows Forms application with .NET 4.0.
Any ideas? Why would it do that? Can we disable that somehow? It's for Adobe InCopy, and it does not open it like this.
UPDATE: Alternative Solution:
Saving it with the XmlTextWriter works too:
    XmlTextWriter writer = new XmlTextWriter(inCopyFilename, null);
    doc.Save(writer);
    writer.Close(); // flush and release the file handle
To check whether a file has a BOM, open it in Notepad++ and look at the encoding field in the bottom-right corner of the status bar. If it says UTF-8-BOM, the file starts with a BOM.
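The same check can be done from code by reading the first three bytes of the file. This is only a minimal sketch; `BomCheck` and `HasUtf8Bom` are names made up for this example and are not part of the framework.

```csharp
using System;
using System.IO;

static class BomCheck
{
    // Returns true when the file starts with the UTF-8 BOM bytes EF BB BF.
    public static bool HasUtf8Bom(string path)
    {
        byte[] head = new byte[3];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop until
            // three bytes are available or the file ends.
            while (total < head.Length)
            {
                int n = fs.Read(head, total, head.Length - total);
                if (n == 0) break;
                total += n;
            }
        }
        return total == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
    }
}
```

Calling `BomCheck.HasUtf8Bom(xmlTranslatedFile)` after the save in the question should then tell you whether the three bytes are there.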
Only a < or a whitespace character can begin a well-formed XML document.
It is the UTF-8 BOM, which is actually discouraged by the Unicode standard:
http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature
You may disable it using:
    // false = overwrite the file; UTF8Encoding(false) = do not emit a BOM
    var sw = new System.IO.StreamWriter(path, false, new System.Text.UTF8Encoding(false));
    doc.Save(sw);
    sw.Close();
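Another way to get the same result is to go through `XmlWriter.Create` with an `XmlWriterSettings` whose `Encoding` is a `UTF8Encoding` constructed with `false`. The sketch below assumes that is acceptable for your file; `XmlSaveHelper` and `SaveWithoutBom` are hypothetical names, and `Indent = true` is my assumption to roughly match the indented output that `doc.Save(fileName)` normally produces.

```csharp
using System.Text;
using System.Xml;

static class XmlSaveHelper
{
    // Hypothetical helper: writes the document as UTF-8 without a BOM.
    public static void SaveWithoutBom(XmlDocument doc, string path)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        // Passing false tells UTF8Encoding not to emit the EF BB BF preamble.
        settings.Encoding = new UTF8Encoding(false);
        // Assumption: indent the output, similar to XmlDocument.Save(string).
        settings.Indent = true;

        // The using block flushes and closes the file even if Save throws.
        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            doc.Save(writer);
        }
    }
}
```

In the question's code this would replace the last line with `XmlSaveHelper.SaveWithoutBom(doc, xmlTranslatedFile);`.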
It's a UTF-8 Byte Order Mark (BOM) and is to be expected.
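To see where the bytes come from, you can ask an encoding for its preamble. As far as I understand, `doc.Save(fileName)` picks its encoding from the XML declaration, and the stock `Encoding.UTF8` instance carries the BOM as its preamble. A small sketch (`PreambleDemo` and `PreambleHex` are names invented for this example):

```csharp
using System;
using System.Text;

static class PreambleDemo
{
    // Formats the bytes an encoding would write at the start of a file.
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }
}
```

`PreambleDemo.PreambleHex(Encoding.UTF8)` returns "EF-BB-BF", while `PreambleDemo.PreambleHex(new UTF8Encoding(false))` returns an empty string, which is why passing such an encoding to the writer suppresses the BOM.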