
How do I write unescaped XML outside of a CDATA section?

I am trying to write XML data using StAX, where the content itself is HTML.

If I try

xtw.writeStartElement("contents");
xtw.writeCharacters("<b>here</b>");
xtw.writeEndElement();

I get this

<contents>&lt;b&gt;here&lt;/b&gt;</contents>

Then I notice the CDATA method and change my code to:

xtw.writeStartElement("contents");
xtw.writeCData("<b>here</b>");
xtw.writeEndElement();

and this time the result is

<contents><![CDATA[<b>here</b>]]></contents>

which is still not good. What I really want is

<contents><b>here</b></contents>

So is there an XML API or library that allows me to write raw text without wrapping it in a CDATA section? So far I have looked at StAX and JDOM, and neither seems to offer this.

In the end I might resort to good old StringBuilder, but that would not be elegant.

Update:

I mostly agree with the answers so far. However, instead of <b>here</b> I could have a 1MB HTML document that I want to embed in a bigger XML document. What you suggest means that I would have to parse this HTML document in order to understand its structure. I would like to avoid this if possible.

Answer:

It is not possible, otherwise you could create invalid XML documents.
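The writer API itself has no raw-write call, but a workaround that is sometimes used is to bypass the XMLStreamWriter and push the fragment straight into the underlying Writer. The following is only a sketch under assumptions: the class name RawFragmentExample and the method wrapRaw are hypothetical, and it assumes that an empty writeCharacters call makes the implementation close the pending start tag, which is how the JDK's built-in writer behaves but is not guaranteed by the StAX interface.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class RawFragmentExample {

    // Hypothetical helper: wraps rawMarkup in <contents>...</contents>
    // without escaping it and without parsing it.
    static String wrapRaw(String rawMarkup) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xtw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        xtw.writeStartElement("contents");
        // Assumption: an empty text node forces the writer to emit the
        // closing '>' of the start tag, which it otherwise keeps pending.
        xtw.writeCharacters("");
        // Push everything written so far into 'out' before bypassing the writer.
        xtw.flush();

        // Raw write to the underlying Writer: no escaping, no validation.
        out.write(rawMarkup);

        xtw.writeEndElement();
        xtw.flush();
        xtw.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(wrapRaw("<b>here</b>"));
    }
}
```

Nothing here checks the fragment, so a call such as wrapRaw("<b>oops") produces a document that is not well-formed. That is exactly the risk the answer above describes, so this only makes sense when the embedded markup is already known to be well-formed XML.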

kazanaki asked Jun 08 '10 10:06



1 Answer

The issue is that this is not raw text; it is an element, so you should be writing:

xtw.writeStartElement("contents");
xtw.writeStartElement("b");
xtw.writeCharacters("here");
xtw.writeEndElement();
xtw.writeEndElement();
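
For completeness, here is a self-contained sketch of the nested-element approach. The class name NestedElementExample and the method writeContents are made up for illustration. It uses writeCharacters for the text inside the b element, so that the output matches the requested form rather than containing a CDATA section.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class NestedElementExample {

    // Hypothetical helper: emits <b> as a real element inside <contents>,
    // with 'text' as its character content.
    static String writeContents(String text) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xtw = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        xtw.writeStartElement("contents");
        xtw.writeStartElement("b");
        xtw.writeCharacters(text);   // text only; markup characters would be escaped
        xtw.writeEndElement();       // closes </b>
        xtw.writeEndElement();       // closes </contents>

        xtw.flush();
        xtw.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // prints <contents><b>here</b></contents>
        System.out.println(writeContents("here"));
    }
}
```

The trade-off is the one raised in the question: every element of the embedded markup has to be written through the API, which means its structure must be known.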
mmmmmm answered Nov 07 '22 01:11