Getting "ï»¿" at the beginning of my XML File after save() [duplicate]

I'm opening an existing XML file with C# and replacing some nodes in it. That all works fine, but after I save the file I get the following characters at the beginning of it:

ï»¿ (EF BB BF in hex)

The whole first line:

    ï»¿<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

The rest of the file looks like a normal XML file. The simplified code is here:

    XmlDocument doc = new XmlDocument();
    doc.Load(xmlSourceFile);
    XmlNode translation = doc.SelectSingleNode("//trans-unit[@id='127']");
    translation.InnerText = "testing";
    doc.Save(xmlTranslatedFile);

I'm using a C# Windows Forms application with .NET 4.0.

Any ideas? Why would it do that, and can we disable it somehow? The file is for Adobe InCopy, which does not open it in this state.

UPDATE: Alternative Solution:

Saving it with the XmlTextWriter works too:

    // A null encoding makes XmlTextWriter write UTF-8 and omit the encoding attribute.
    using (XmlTextWriter writer = new XmlTextWriter(inCopyFilename, null))
    {
        doc.Save(writer);
    }

Remy, asked Jan 06 '11 11:01


2 Answers

It is the UTF-8 BOM (byte order mark), which the Unicode standard neither requires nor recommends:

http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf

Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature.

You may disable it using:

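    // UTF8Encoding(false) suppresses the BOM; the second argument (false) overwrites rather than appends.
    using (var sw = new System.IO.StreamWriter(path, false, new System.Text.UTF8Encoding(false)))
    {
        doc.Save(sw);
    }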
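To confirm that the BOM is really gone, it may help to look at the first three bytes of the saved file. The following is only a minimal sketch: the class and method names (`BomCheck`, `HasUtf8Bom`, `SaveWithoutBom`) are hypothetical and not part of the original answer, and `XmlWriterSettings` is used here as an alternative route to the same BOM-less `UTF8Encoding(false)`. `Indent = true` is an assumption about the desired layout.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class BomCheck
{
    // Returns true when the file starts with the UTF-8 signature EF BB BF.
    public static bool HasUtf8Bom(string path)
    {
        byte[] head = new byte[3];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(head, 0, 3);
            return read == 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
        }
    }

    // Saves the document as UTF-8 without a BOM via XmlWriterSettings.
    public static void SaveWithoutBom(XmlDocument doc, string path)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false), // false = do not emit the BOM
            Indent = true                       // assumption: indented output is wanted
        };
        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            doc.Save(writer);
        }
    }
}
```

After `BomCheck.SaveWithoutBom(doc, xmlTranslatedFile)`, a call to `BomCheck.HasUtf8Bom(xmlTranslatedFile)` should return false.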
dalle, answered Oct 02 '22 07:10


It's a UTF-8 Byte Order Mark (BOM) and is to be expected.
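One way to see why it is expected: .NET's default `Encoding.UTF8` object reports EF BB BF as its preamble, and writers that honor the preamble put those bytes at the start of the output. Presumably `XmlDocument.Save(string)` ends up with such an encoding because the XML declaration says `encoding="UTF-8"`. The sketch below only prints the preambles; `PreambleDemo` and `PreambleHex` are hypothetical names used for illustration.

```csharp
using System;
using System.Text;

static class PreambleDemo
{
    // Formats the byte sequence an encoding would emit as its signature (BOM).
    public static string PreambleHex(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetPreamble());
    }
}
```

`PreambleDemo.PreambleHex(Encoding.UTF8)` yields "EF-BB-BF", while `PreambleDemo.PreambleHex(new UTF8Encoding(false))` yields an empty string. Those three bytes, read as Windows-1252 text, are what show up as "ï»¿".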

David Heffernan, answered Oct 02 '22 07:10