
Why is Apache Xerces/Xalan adding additional carriage returns to my serialized output?

I'm using Apache Xerces 2.11.0 and Apache Xalan 2.7.1 and I'm having problems with additional carriage return characters in the serialized XML.

I have this (pseudo) code:

String myString = ...; // Base64 text containing \r\n line breaks
Document doc = ...;

// Wrap the text in a CDATA section inside a new <item> element
Element item = doc.createElement("item");
item.appendChild(doc.createCDATASection(myString));

// Serialize the DOM tree to a byte stream
Transformer transformer = ...;
ByteArrayOutputStream stream = new ByteArrayOutputStream();
Result result = new StreamResult(stream);
transformer.transform(new DOMSource(doc), result);

myString contains \r\n line breaks (it is actually Base64-encoded data), but when I look at the serialized output, each line break has an additional \r character.

Input:

Line 1 \r\n
Line 2 \r\n
Line 3 \r\n

Output:

Line 1 \r\r\n
Line 2 \r\r\n
Line 3 \r\r\n

If I use createTextNode instead of createCDATASection the output becomes even more interesting:

Line 1 &#13;\r\n
Line 2 &#13;\r\n
Line 3 &#13;\r\n

The additional character seems to be introduced during serialization; the DOM tree itself seems to be correct (according to getTextContent()).

Why is this happening? What can I do to fix this?

Daniel Rikowski asked Jun 11 '11 16:06



2 Answers

I guess you are seeing this on Windows and not on Linux/Solaris/Mac. The Xalan serializer (org.apache.xml.serializer.ToStream) gets the line separator from System.getProperty("line.separator"), which is \r\n on Windows. When the serializer writes \r\n, it passes the \r through unchanged and treats the \n as the end-of-line marker, replacing it with the line separator, so it actually writes \r + lineSeparator = \r\r\n.

Although this sounds strange, it is not a bug, see [1]. But since it was frequently reported as one, a Xalan extension property was added [2]. So you can set it programmatically:

transformer.setOutputProperty("{http://xml.apache.org/xalan}line-separator","\n");

or

<xsl:output xalan:line-separator="&#10;" />

where xalan is a prefix bound to the namespace URI "http://xml.apache.org/xalan".

[1] https://issues.apache.org/jira/browse/XALANJ-1660

[2] https://issues.apache.org/jira/browse/XALANJ-2093
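To put the property in context, here is a minimal, self-contained sketch of the question's scenario with the line separator forced to \n. The class name LineSeparatorDemo and the helper serialize are made up for illustration. It assumes a Xalan-J version that includes the extension property from [2] is the active TransformerFactory; other JAXP implementations accept namespaced properties without complaint but may simply ignore this one.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of Xalan or Xerces.
public class LineSeparatorDemo {

    // Xalan extension output property described in the answer above.
    static final String XALAN_LINE_SEPARATOR =
            "{http://xml.apache.org/xalan}line-separator";

    // Builds <item><![CDATA[text]]></item> and serializes it with the
    // requested line separator (assumption: the serializer honors the property).
    static String serialize(String text, String lineSeparator) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element item = doc.createElement("item");
            item.appendChild(doc.createCDATASection(text));
            doc.appendChild(item);

            // Identity transformer: copies the DOM tree to the output.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(XALAN_LINE_SEPARATOR, lineSeparator);

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = serialize("Line 1\r\nLine 2\r\n", "\n");
        // Make the control characters visible when printing.
        System.out.println(xml.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

With the separator set to \n, each \n in the data is written back as a single \n, so an existing \r\n pair should come out unchanged instead of becoming \r\r\n.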

Alex Giotis answered Sep 27 '22 19:09



Odd, but try calling transformer.setOutputProperty(javax.xml.transform.OutputKeys.INDENT, "no"); immediately after creating the transformer and see what happens.

Femi answered Sep 27 '22 18:09
