
Do I need to replace double/single quote in XML body text?

Tags:

xml

Please correct my terminology here if it's off:

The 5 character substitutions for XML are:

  • &amp; ( & )
  • &lt; ( < )
  • &gt; ( > )
  • &quot; ( " )
  • &apos; ( ' )

Do all of these substitutions need to happen in element text, or only in attribute text? (terminology correction?)

e.g. is this valid XML?

<myelement>x && y</myelement>
<myelement>And I quote, "no"</myelement>

&gt; and &lt; seem obvious to replace in this context, but I'm not clear whether the replacement rules are global for the entire XML document, or whether they apply differently to different parts of the document (for example, CDATA sections apply different rules).

Assumption: this is invalid XML:

<myelement field="no & allowed here"/>
<myelement field="no <> allowed here"/>

Quotes are obvious delimiters of attributes, and <> are obvious delimiters of element text.

David Parks asked Jun 02 '14 19:06


1 Answer

In element content you only need to escape & and <; you never need to escape single or double quotes, and you need to escape > only if it appears as part of the sequence ]]> (many people replace it unconditionally, because that's simpler).
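The element-content rule can be checked by handing a few strings to a parser. Below is a minimal sketch using Python's standard library (`xml.etree.ElementTree` and `xml.sax.saxutils.escape`); the helper name `is_well_formed` is made up for this illustration and is not part of any library.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts doc as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Quotes need no escaping in element content; a bare & does.
quotes_ok = is_well_formed('<myelement>And I quote, "no"</myelement>')
bare_amp_ok = is_well_formed("<myelement>x && y</myelement>")

# escape() replaces &, < and > unconditionally, which is the
# "simpler" approach of always escaping > as well.
escaped = escape("x && y < z")
escaped_ok = is_well_formed(f"<myelement>{escaped}</myelement>")

print(quotes_ok, bare_amp_ok, escaped, escaped_ok)
```

Here `escape()` does slightly more than the minimum, since it also rewrites every `>`, but the result is always safe to place in element content.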

In attribute content you only need to escape & and < and either ' or ", depending which one was used as the attribute delimiter.
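For attribute values, Python's standard `xml.sax.saxutils.quoteattr` follows the same idea: it escapes & and <, then picks a delimiter so that as few quote characters as possible need escaping. This is only a sketch of one way to do it; the sample strings are taken from the question or invented for illustration.

```python
from xml.sax.saxutils import quoteattr

# quoteattr() returns the value already wrapped in its delimiters.
plain = quoteattr("no & allowed here")
angle = quoteattr("no <> allowed here")

# Only double quotes in the value: single-quote delimiters are used,
# so nothing needs escaping.
with_double = quoteattr('say "no"')

# Both kinds of quote in the value: double-quote delimiters are used
# and the inner double quotes become &quot;.
with_both = quoteattr('it\'s "no"')

for value in (plain, angle, with_double, with_both):
    print(f"<myelement field={value}/>")
```

Note that `quoteattr()` also escapes `>` and some whitespace characters, which goes beyond the strict minimum described above.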

Entity references (the sequences starting with &) are not recognized in comments or CDATA sections, or in element or attribute names, so special characters must not be escaped in those contexts.
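The CDATA case can be seen by parsing the same text both ways. A small sketch with Python's standard `xml.etree.ElementTree`, using made-up sample content:

```python
import xml.etree.ElementTree as ET

# Inside CDATA, & is literal and &amp; is NOT turned back into &.
cdata = ET.fromstring("<myelement><![CDATA[x && y &amp; z]]></myelement>")

# In ordinary element content, &amp; is an entity reference for &.
plain = ET.fromstring("<myelement>x &amp;&amp; y</myelement>")

cdata_text = cdata.text
plain_text = plain.text
print(repr(cdata_text))
print(repr(plain_text))
```

Escaping inside a CDATA section would therefore leave the literal characters `&amp;` in the data rather than a single `&`.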

Michael Kay answered Oct 20 '22 11:10