
What is the rationale behind XmlDocument mixed content pretty-printing behavior?

.NET XmlDocument has an interesting behavior when pretty-printing mixed content nodes using XmlDocument.Save(TextWriter).

The behavior can be summarized as "once the pretty printer encounters a text node, it disables indentation and automatic newlines for the rest of the current subtree".

Here's an example (http://ideone.com/b1WxD7):

<?xml version='1.0'?>
<root><test><child1/><child2/>foo<child3><child4/></child3></test></root>

is pretty-printed to

<?xml version="1.0"?>
<root>
  <test>
    <child1 />
    <child2 />foo<child3><child4 /></child3></test>
</root>

This behavior seems neither correct nor intuitive. Why does XmlDocument work like that?

zeuxcg asked Oct 19 '22 16:10


1 Answer

This behavior is unfortunate, but I think it can be explained by the description of the Formatting.Indented option for XmlTextWriter (which is what XmlDocument.Save is using here):

"Causes child elements to be indented according to the Indentation and IndentChar settings. This option indents element content only; mixed content is not affected."

The intent of this option is to preserve the formatting of XML like

<p>Here is some <b>bold</b> text.</p>

and not have it reformatted as

<p>
    Here is some 
    <b>
        bold
    </b>
     text.
</p>

But there's a problem: How does XmlTextWriter know an element contains mixed content? Because XmlTextWriter is a non-cached, forward-only writer, the answer is that it doesn't until it actually encounters character data. At that point, it switches to "mixed content" mode and suppresses formatting. Unfortunately, it's too late to undo the formatting of child nodes that have already been written to the underlying stream. In your example, child1 and child2 had already been written with indentation by the time "foo" appeared, so only the nodes after the text (child3, child4, and the closing test tag) lose their formatting.

Michael Liu answered Dec 08 '22 21:12