
HTML inside XML: should I use CDATA or encode the HTML? [closed]

I am using XML to share HTML content. AFAIK, I could embed the HTML in one of two ways:

  • Encoding it: I don't know whether this is completely safe, and I would have to decode it again.

  • Using CDATA sections: I believe I could still have problems if the content contains the closing delimiter "]]>" or certain hexadecimal characters. On the other hand, the XML parser would extract the content transparently for me.

Which option should I choose?

UPDATE: The XML will be created in Java and passed as a string to a .NET web service, where it will be parsed back. Therefore I need to be able to export the XML as a string and load it using "doc.LoadXml(xmlString);".

alberto asked Sep 09 '09 09:09


1 Answer

The two options are almost exactly the same. Here are your two choices:

<html>This is &lt;b&gt;bold&lt;/b&gt;</html>
<html><![CDATA[This is <b>bold</b>]]></html>

In both cases, you have to check your string for special characters that need escaping: "<" and "&" in the encoded form, "]]>" in the CDATA form. Lots of people pretend that CDATA strings don't need any escaping, but as you point out, you have to make sure that "]]>" doesn't slip in unescaped.
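
As a minimal sketch of the escaping each form needs, here is one way it could look in Python (chosen only for brevity; the question's producer is Java). The helper names `as_escaped_text` and `as_cdata` are made up for illustration, and the `<html>` wrapper element is the one from the example above. In practice an XML library's serializer would normally do the entity escaping for you.

```python
from xml.sax.saxutils import escape


def as_escaped_text(html: str) -> str:
    """Wrap html in an <html> element using entity escaping."""
    # escape() replaces &, < and > with &amp;, &lt; and &gt;
    return "<html>" + escape(html) + "</html>"


def as_cdata(html: str) -> str:
    """Wrap html in a CDATA section, splitting any "]]>" it contains."""
    # "]]>" would end the section early, so close the section after "]]"
    # and open a new one before ">".
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return "<html><![CDATA[" + safe + "]]></html>"


print(as_escaped_text("This is <b>bold</b>"))
print(as_cdata("This is <b>bold</b>"))
print(as_cdata("a]]>b"))
```

Neither helper deals with characters that XML 1.0 forbids outright (most control characters below U+0020); those would have to be stripped or encoded some other way, whichever form is used.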

In both cases, the XML processor will return your string to you decoded.
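
A quick way to check this is to parse both forms and compare the text that comes back. The sketch below uses Python's standard `xml.etree.ElementTree` purely as a stand-in parser; the variable names are illustrative. A .NET `XmlDocument` loaded with `LoadXml` should behave the same way when the element's `InnerText` is read, though that is not shown here.

```python
import xml.etree.ElementTree as ET

# The two forms from the answer, plus a CDATA form whose payload contains "]]>"
escaped = "<html>This is &lt;b&gt;bold&lt;/b&gt;</html>"
cdata = "<html><![CDATA[This is <b>bold</b>]]></html>"
split_cdata = "<html><![CDATA[a]]]]><![CDATA[>b]]></html>"

# The parser undoes the entity escaping and unwraps the CDATA sections
text_from_escaped = ET.fromstring(escaped).text
text_from_cdata = ET.fromstring(cdata).text
text_from_split = ET.fromstring(split_cdata).text

print(text_from_escaped == text_from_cdata)
print(text_from_split)
```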

Ned Batchelder answered Sep 22 '22 18:09