I often see people asking XML/XSLT related questions here that are rooted in an inability to grasp how CDATA works (like this one).
I wonder - why does it exist in the first place? It's not that XML could not do without it; everything you can put into a CDATA section can also be expressed as "native" (XML-escaped) text.
I appreciate that CDATA potentially makes the resulting document a bit smaller, but let's face it - XML is verbose anyway. Small XML documents can be achieved more easily through compression, for example.
For me, CDATA breaks the clean separation of markup and data, since you can have data that looks like markup to the unaided eye, which I find is a bad thing. (This may even be one of the things that encourages people to inappropriately apply string processing or regex to XML.)
So: What good reason is there to use CDATA?
A CDATA section is used to mark a section of an XML document so that the XML parser interprets it only as character data, and not as markup. It comes in handy when one XML document needs to be embedded within another XML document. There are two methods to ensure that such an XML file is well-formed: escape the special characters as entity references, or enclose the text in a CDATA section.
The term CDATA, meaning character data, is used for distinct, but related, purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.
XML character data (CDATA) is a block of text, and a type of XML node, that the markup language recognizes but that the parser does not parse as markup. It is used, for example, to include mathematical expressions in an XML document: an expression containing < or > can be wrapped in a CDATA section and passed through unchanged.
The term CDATA means character data. CDATA is defined as blocks of text that the parser does not parse as markup, but treats as plain character data. The predefined entities such as &lt;, &gt;, and &amp; require extra typing and are generally difficult to read in the markup.
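To make the equivalence between the two notations concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element name expr and the sample expression are made up for illustration; the point is only that the parser hands the application the same character data either way.

```python
import xml.etree.ElementTree as ET

# The same character data written twice: once with predefined entities,
# once inside a CDATA section. "expr" is a made-up element name.
escaped = "<expr>a &lt; b &amp;&amp; b &gt; c</expr>"
cdata = "<expr><![CDATA[a < b && b > c]]></expr>"

text_from_escaped = ET.fromstring(escaped).text
text_from_cdata = ET.fromstring(cdata).text

# Both documents yield identical text content after parsing.
print(text_from_escaped == text_from_cdata)
```

So the choice between the two is about what the source looks like to a human, not about what the application receives.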
CDATA sections are just for the convenience of human authors, not for programs. Their only use is to give humans the ability to easily include e.g. SVG example code in an XHTML page without needing to carefully replace every < with &lt; and so on.
That is for me the intended use. Not to make the resulting document a few bytes smaller because you can use < instead of &lt;.
Also, again taking the example from above (SVG code in XHTML), it makes it easy for me to check the source code of the XHTML file and just copy-paste the SVG code out without needing to back-replace &lt; with <.
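A rough sketch of that readability difference, assuming Python's standard xml.dom.minidom module. The SVG snippet and the pre wrapper element are invented for the example; the same text is stored once as an ordinary text node and once as a CDATA section, then serialized.

```python
import xml.etree.ElementTree as ET
from xml.dom.minidom import Document

# Made-up SVG example code that an author wants to show inside a page.
svg_source = "<svg><circle r='5'/></svg>"

def serialize(use_cdata):
    """Put svg_source inside a <pre> element and return the serialized XML."""
    doc = Document()
    pre = doc.createElement("pre")
    doc.appendChild(pre)
    if use_cdata:
        node = doc.createCDATASection(svg_source)
    else:
        node = doc.createTextNode(svg_source)
    pre.appendChild(node)
    return pre.toxml()

as_text = serialize(use_cdata=False)   # markup characters come out escaped
as_cdata = serialize(use_cdata=True)   # the source appears verbatim

print(as_text)
print(as_cdata)

# Parsing either form gives back the original string.
round_trip_text = ET.fromstring(as_text).text
round_trip_cdata = ET.fromstring(as_cdata).text
```

Only the CDATA version contains the SVG source in a form that can be copy-pasted straight out of the file.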
PCDATA - parsed character data, which means the data entered will be parsed by the parser.
CDATA - the data entered between the CDATA delimiters will not be parsed as markup by the parser; that is, the text inside the CDATA section is passed through as literal character data. As a result, a malicious user can use CDATA sections to send harmful data to the application.
A CDATA section starts with <![CDATA[ and ends with ]]>.
The only string that cannot occur in CDATA is ]]>.
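If the text does contain ]]>, a common workaround is to end the section in the middle of that sequence and immediately open a new one. The sketch below assumes Python's xml.etree.ElementTree for the round-trip check; wrap_in_cdata is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text):
    """Hypothetical helper: wrap text in CDATA, splitting every ']]>'.

    Each ']]>' is cut after ']]': the current section is closed there and
    a new one is opened that starts with the remaining '>'.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Made-up payload that happens to contain the forbidden sequence.
payload = "if (a[b[0]]> 1) { run(); }"
document = "<script>" + wrap_in_cdata(payload) + "</script>"

# The parser joins the adjacent sections back into the original text.
recovered = ET.fromstring(document).text
print(recovered == payload)
```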
The only reason why we use CDATA is: text like JavaScript code contains a lot of < and & characters. To avoid errors, script code can be defined as CDATA, because using < alone will generate an error, as the parser interprets it as the start of a new element. Similarly, & can be interpreted by the parser as the start of an entity reference.
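A small sketch of that failure mode, assuming Python's xml.etree.ElementTree as the parser. The script text is invented, and parses is a hypothetical helper that just reports whether a document is well-formed.

```python
import xml.etree.ElementTree as ET

# Made-up script text with unescaped '<' and '&' characters.
script = "if (a < b && b < c) { swap(a, c); }"

raw = "<script>" + script + "</script>"
wrapped = "<script><![CDATA[" + script + "]]></script>"

def parses(xml_text):
    """Hypothetical helper: True if xml_text is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(parses(raw))      # the bare '<' and '&' are rejected as bad markup
print(parses(wrapped))  # inside CDATA they are plain character data
```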