Please correct my terminology here if it's off:
The 5 character substitutions for XML are: &amp;, &lt;, &gt;, &quot;, and &apos;.
Do all of these substitutions need to happen in element text, or only in attribute text? (terminology correction?)
e.g. is this valid XML?
<myelement>x && y</myelement>
<myelement>And I quote, "no"</myelement>
> and < seem obvious to replace in this context, but I'm not clear whether the replacement rules are global for the entire XML document, or whether they apply differently to different parts of the document (for example, CDATA sections apply different rules).
Assumption: this is invalid XML:
<myelement field="no & allowed here"/>
<myelement field="no <> allowed here"/>
Quotes are obvious delimiters of attributes, and <> are obvious delimiters of element text.
In element content you only need to escape & and <; you never need to escape single or double quotes, and you need to escape > only if it appears as part of the sequence ]]> (many people replace it unconditionally, because that's simpler).
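As a rough illustration of the element-content rules, here is a minimal Python sketch. The helper names `escape_text` and `is_well_formed` are made up for this example and are not from any library; the check uses the standard-library `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET


def escape_text(s: str) -> str:
    """Minimal escaping for XML element content (hypothetical helper)."""
    s = s.replace("&", "&amp;")      # first, so later entities are not double-escaped
    s = s.replace("<", "&lt;")
    s = s.replace("]]>", "]]&gt;")   # '>' only needs escaping in this sequence
    return s


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The two examples from the question:
print(is_well_formed('<myelement>x && y</myelement>'))             # False
print(is_well_formed('<myelement>And I quote, "no"</myelement>'))  # True

# Round trip: quotes and a lone '>' survive unescaped.
raw = 'x && y, "quoted", a > b, a < b, end ]]>'
doc = "<myelement>" + escape_text(raw) + "</myelement>"
assert ET.fromstring(doc).text == raw
```

So the first example from the question is rejected because of the bare &, while the one with double quotes is fine as written.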
In attribute content you only need to escape & and < and either ' or ", depending on which one was used as the attribute delimiter.
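The attribute rules can be sketched the same way. `escape_attr` and `parses` below are hypothetical helpers written for this example, assuming the attribute value contains no tabs or newlines (those are subject to attribute-value normalization, which this sketch does not handle).

```python
import xml.etree.ElementTree as ET


def escape_attr(s: str, quote: str = '"') -> str:
    """Minimal escaping for an attribute value delimited by `quote` (hypothetical helper)."""
    s = s.replace("&", "&amp;").replace("<", "&lt;")
    if quote == '"':
        s = s.replace('"', "&quot;")   # only the delimiter in use needs escaping
    else:
        s = s.replace("'", "&apos;")
    return s


def parses(doc: str) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The two attribute examples from the question are indeed rejected:
print(parses('<myelement field="no & allowed here"/>'))   # False
print(parses('<myelement field="no <> allowed here"/>'))  # False

# Round trip with each delimiter; '>' and the other quote stay as they are.
raw = "no & allowed, no <> allowed, \"double\" and 'single'"
for q in ('"', "'"):
    doc = "<myelement field=" + q + escape_attr(raw, q) + q + "/>"
    assert ET.fromstring(doc).get("field") == raw
```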
Entities starting with & are not recognized in comments or CDATA sections, or in element or attribute names, so special characters must not be escaped in those contexts.
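A small check of the CDATA case, again only a sketch using Python's standard-library parser: inside a CDATA section, & and < are taken literally, and an entity such as &amp; is not expanded.

```python
import xml.etree.ElementTree as ET

# Inside CDATA nothing is treated as markup except the closing ]]> sequence.
doc = "<myelement><![CDATA[x && y < z, and &amp; stays as typed]]></myelement>"
text = ET.fromstring(doc).text
print(text)  # x && y < z, and &amp; stays as typed
```

This is why escaping inside CDATA would be wrong: the reader would see the literal characters `&amp;` rather than `&`.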