What is the reason that CDATA even exists?

Tags: xml, cdata, xslt

I often see people asking XML/XSLT related questions here that are rooted in an inability to grasp how CDATA works (like this one).

I wonder - why does it exist in the first place? It's not as if XML could not do without it: everything you can put into a CDATA section can also be expressed as "native" (XML-escaped) text.
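To make that equivalence concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element name "note" and the sample text are made up for illustration. Both documents hand the application the same string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: same content, two spellings.
with_cdata = "<note><![CDATA[if (a < b && b > c) { run(); }]]></note>"
escaped = "<note>if (a &lt; b &amp;&amp; b &gt; c) { run(); }</note>"

# The parser reports plain character data in both cases.
text_from_cdata = ET.fromstring(with_cdata).text
text_from_escaped = ET.fromstring(escaped).text

print(text_from_cdata == text_from_escaped)  # prints True
```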

I appreciate that CDATA potentially makes the resulting document a bit smaller, but let's face it - XML is verbose anyway. Small XML documents can be achieved more easily through compression, for example.

For me, CDATA breaks the clean separation of markup and data since you can have data that looks like markup to the unaided eye, which I find is a bad thing. (This may even be one of the things that encourages people to inadequately apply string processing or regex to XML.)

So: What good reason is there to use CDATA?

Tomalak asked Nov 11 '09 10:11



2 Answers

CDATA sections are just for the convenience of human authors, not for programs. Their only use is to let humans easily include, for example, SVG sample code in an XHTML page without having to carefully replace every < with &lt; and so on.

For me, that is the intended use, not making the resulting document a few bytes smaller because you can write < instead of &lt;.

Taking the same example again (SVG code in XHTML), it also makes it easy for me to open the source of the XHTML file and copy-paste the SVG code out without having to replace every &lt; with < again.
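A rough illustration of that point, assuming a made-up SVG snippet and a "pre" wrapper element: a program can escape mechanically with xml.sax.saxutils.escape, so it does not need CDATA, but only the CDATA spelling keeps the sample readable and copy-pasteable in the page source.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical SVG sample to be shown as text inside an XHTML page.
svg_sample = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="10"/></svg>'

# What a program would emit: every <, > and & escaped.
escaped_page = "<pre>" + escape(svg_sample) + "</pre>"

# What a human author would rather type: the sample pasted verbatim.
cdata_page = "<pre><![CDATA[" + svg_sample + "]]></pre>"

print(escaped_page)
print(cdata_page)

# Both spellings carry the same text once parsed.
escaped_text = ET.fromstring(escaped_page).text
cdata_text = ET.fromstring(cdata_page).text
```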

jitter answered Nov 24 '22 06:11


PCDATA - parsed character data, which means the parser will scan the text for markup.

CDATA - the text between the CDATA delimiters is not parsed as markup. The parser does not discard it; it passes it through to the application as literal character data. As a result, a malicious user can try to send harmful data to the application inside a CDATA section, so that text should be treated like any other untrusted input.
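A small sketch of what the parser actually does with a CDATA section, using Python's standard xml.dom.minidom. The element names and payload are invented for illustration. The section arrives as a single text-like node, and the tags inside it never become elements.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical input: markup-looking text wrapped in a CDATA section.
doc = parseString("<msg><![CDATA[<b>not an element</b> & more]]></msg>")
children = doc.documentElement.childNodes

# One CDATA node, no <b> child element.
only_child = children[0]
is_cdata_node = only_child.nodeType == Node.CDATA_SECTION_NODE

# The content is handed over verbatim, so the application still sees it.
payload = only_child.data
print(len(children), is_cdata_node, payload)
```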

CDATA section starts with <![CDATA[ and ends with ]]>.

The only string that cannot occur in CDATA is ]]>.
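If the text itself contains ]]>, a common workaround is to end the section in the middle of that sequence and open a new one. The helper name wrap_in_cdata below is made up for this sketch and is not a library function.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    # Close the section after "]]" and reopen it before ">".
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

# Hypothetical payload that happens to contain the forbidden sequence.
source_text = "if (a[b[0]]> 1) { ok(); }"
document = "<script>" + wrap_in_cdata(source_text) + "</script>"

# The parser joins the adjacent sections back into the original text.
round_tripped = ET.fromstring(document).text
print(round_tripped == source_text)  # prints True
```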

The only reason we use CDATA is that text such as JavaScript code contains a lot of < and & characters. To avoid errors, script code can be placed in a CDATA section, because a bare < generates an error when the parser interprets it as the start of a new element. Similarly, the parser interprets & as the start of an entity or character reference.
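The failure described above can be reproduced with Python's standard xml.etree.ElementTree. The script content is a made-up sample. The raw version is rejected as not well-formed, while the CDATA version parses.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body with bare < and & characters.
script_body = "if (a < b && c) { go(); }"

raw = "<script>" + script_body + "</script>"
try:
    ET.fromstring(raw)
    raw_is_well_formed = True
except ET.ParseError:
    # The bare "<" and "&" are read as the start of markup.
    raw_is_well_formed = False

wrapped = "<script><![CDATA[" + script_body + "]]></script>"
script_text = ET.fromstring(wrapped).text

print(raw_is_well_formed)          # prints False
print(script_text == script_body)  # prints True
```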

Madhan answered Nov 24 '22 08:11