I need to save content containing newlines in XML attributes, not in text nodes. The method should be chosen so that I can decode it in XSLT 1.0/EXSLT/XSLT 2.0.
What is the best encoding method?
Please suggest/give some ideas.
You can use &#10; for line feed (LF) or &#13; for carriage return (CR), and an XML parser will replace it with the respective character when handing off the parsed text to an application.
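To see this behaviour in practice, here is a minimal sketch using Python's standard-library xml.etree.ElementTree parser. The element and attribute names (item, note) are made up for illustration; the attribute value carries the character references &#10; (LF) and &#13; (CR).

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the attribute carries LF and CR as character references.
doc = '<item note="line 1&#10;line 2&#13;line 3"/>'

# The parser resolves the references before handing the value to the application.
note = ET.fromstring(doc).get("note")

print(repr(note))
```

The application sees real LF and CR characters in the value, so an XSLT processor reading the same document should see them as well.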
It's generally considered bad practice to rely on linebreaks, since it's a fragile way to differentiate data. While most XML processors will preserve any whitespace you put in your XML, it's not guaranteed.
Use <br> tags to force line breaks on screen.
In a compliant DOM API there is nothing you need to do. Simply save actual newline characters to the attribute; the API will encode them correctly on its own (see Canonical XML spec, section 5.2).
If you do your own encoding (i.e. replacing \n with &#10; before saving the attribute value), the API will encode your input again, resulting in &amp;#10; in the XML file.
Bottom line is, the string value is saved verbatim. You get out what you put in, no need to interfere.
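The difference between the two approaches can be sketched in Python. This uses xml.etree.ElementTree, which is not a DOM API but, as far as I know, escapes newlines in attribute values when serializing in current Python 3 versions. The helper round_trip and the names item and note are hypothetical.

```python
import xml.etree.ElementTree as ET

def round_trip(value):
    """Serialize `value` as an attribute, parse it back, return (xml_text, parsed_value)."""
    elem = ET.Element("item")      # hypothetical element name
    elem.set("note", value)        # hypothetical attribute name
    xml_text = ET.tostring(elem, encoding="unicode")
    return xml_text, ET.fromstring(xml_text).get("note")

# Case 1: store the real newline and let the serializer encode it.
plain_xml, plain_value = round_trip("line 1\nline 2")

# Case 2: encode by hand first; the serializer then escapes the ampersand again.
double_xml, double_value = round_trip("line 1&#10;line 2")

print(plain_xml)
print(double_xml)
```

In the first case the newline survives the round trip. In the second case the file contains &amp;#10; and the parsed value is the literal text &#10; rather than a newline.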
However… some implementations are not compliant. For example, they will encode & characters in attribute values, but forget about newline characters or tabs. This puts you in a losing position since you can't simply replace newlines with &#10; beforehand.
These implementations will save newline characters unencoded, like this:
<xml attribute="line 1
line 2" />
Upon parsing such a document, each literal newline in an attribute value is normalized to a single space (again, in accordance with the spec), and thus the newlines are lost.
Saving (and retaining!) newlines in attributes is impossible in these implementations.
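The loss on the parsing side is easy to reproduce. The following is a minimal sketch with Python's standard-library xml.etree.ElementTree; the input string stands in for what a non-compliant serializer might have written.

```python
import xml.etree.ElementTree as ET

# Illustrative input: the newline inside the attribute was written unescaped.
unescaped = '<xml attribute="line 1\nline 2" />'

# Attribute-value normalization turns the literal newline into a space.
value = ET.fromstring(unescaped).get("attribute")

print(repr(value))
```

The parsed value contains a space where the newline used to be, so nothing downstream (XSLT included) can tell that a line break was ever there.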