
XDocument adds carriage return when generating final xml string

I need to generate XML that contains line breaks (\n) but no carriage returns (\r) before posting it to an API.

In C#, though, XDocument seems to add carriage returns automatically in its ToString() method:

var inputXmlString = "<root>Some text without carriage return\nthis is the new line</root>";

// inputXmlString: <root>Some text without carriage return\nthis is the new line</root>

var doc = XDocument.Parse(inputXmlString);

var xmlString = doc.Root.ToString();

// xmlString: <root>Some text without carriage return\r\nthis is the new line</root>

In doc.Root.ToString(), \r\n sequences are added between elements for indentation, which does not matter for the receiver's interpretation of the XML message as a whole. However, ToString() also adds \r inside the actual text content, where I need to preserve standalone line breaks (\n with no accompanying \r).

I know I could do a final string replace, removing all carriage returns from the final string prior to the actual HTTP post to be performed, but this just seems wrong.

The issue is the same when constructing the XML document from XElement objects instead of using XDocument.Parse. It also persists if I wrap the text in a CDATA section.

Can anyone explain whether I am doing something wrong, or whether there is a clean way to achieve this?

Stephan Møller asked Sep 29 '16 15:09



1 Answer

XNode.ToString is a convenience that uses an XmlWriter under the covers - you can see the code in the reference source.

Per the documentation for XmlWriterSettings.NewLineHandling:

The Replace setting tells the XmlWriter to replace new line characters with \r\n, which is the new line format used by the Microsoft Windows operating system. This helps to ensure that the file can be correctly displayed by the Notepad or Microsoft Word applications. This setting also replaces new lines in attributes with character entities to preserve the characters. This is the default value.
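The effect of this setting can be observed directly. The following is a minimal sketch, not code from the documentation: the class name `NewLineHandlingDemo` and its methods are hypothetical, and `NewLineChars` is pinned to "\r\n" as an assumption so that the result does not depend on whatever default the runtime or platform uses.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class NewLineHandlingDemo
{
    // Serializes an element with the given NewLineHandling.
    // NewLineChars is set explicitly to "\r\n" (an assumption for this demo)
    // so the output is the same regardless of the platform default.
    public static string Write(XElement element, NewLineHandling handling)
    {
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            NewLineHandling = handling,
            NewLineChars = "\r\n"
        };

        using (var sw = new StringWriter())
        {
            // Disposing the XmlWriter flushes its output into the StringWriter.
            using (var xw = XmlWriter.Create(sw, settings))
            {
                element.WriteTo(xw);
            }
            return sw.ToString();
        }
    }

    // Makes control characters visible, e.g. when printing to a console.
    public static string Show(string s)
    {
        return s.Replace("\r", "\\r").Replace("\n", "\\n");
    }
}
```

Calling `NewLineHandlingDemo.Write(new XElement("root", "line one\nline two"), NewLineHandling.Replace)` turns the lone \n in the text into \r\n, while the same call with `NewLineHandling.None` leaves it untouched. Passing an element that wraps the text in `new XCData(...)` makes it possible to check the CDATA case from the question in the same way.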

So this is why you're seeing this when you convert your element back to a string. If you want to change this behaviour, you'll have to create your own XmlWriter with your own XmlWriterSettings:

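// Requires: using System.IO; using System.Xml; using System.Xml.Linq;
var settings = new XmlWriterSettings
{
    OmitXmlDeclaration = true,
    // Write line breaks exactly as they are stored in the document.
    NewLineHandling = NewLineHandling.None
};

string xmlString;

using (var sw = new StringWriter())
{
    // Disposing the XmlWriter flushes its output into the StringWriter.
    using (var xw = XmlWriter.Create(sw, settings))
    {
        doc.Root.WriteTo(xw);
    }
    xmlString = sw.ToString();
}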
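
As a variation on the settings above (not what the answer itself uses), one could keep `NewLineHandling.Replace` and set `NewLineChars` to "\n" instead. The sketch below assumes that every line break in the output should become a single \n, including the ones added for indentation. The extension class `XElementLfExtensions` and the method `ToLfString` are hypothetical names introduced for illustration.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XElementLfExtensions
{
    // Hypothetical helper: indented output without an XML declaration,
    // with every line break written as "\n".
    public static string ToLfString(this XElement element)
    {
        var settings = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            Indent = true,
            // Replace normalizes "\r\n", "\r" and "\n" in text to NewLineChars,
            // and indentation line breaks use NewLineChars as well.
            NewLineChars = "\n",
            NewLineHandling = NewLineHandling.Replace
        };

        using (var sw = new StringWriter())
        {
            using (var xw = XmlWriter.Create(sw, settings))
            {
                element.WriteTo(xw);
            }
            return sw.ToString();
        }
    }
}
```

With this in scope, `doc.Root.ToLfString()` could stand in for `doc.Root.ToString()`. Unlike `NewLineHandling.None`, this approach also rewrites any \r\n already present in text values, which may or may not be what the receiving API expects.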
Charles Mager answered Nov 14 '22 23:11