How to parse an XML DOM inside a CDATA element in XSLT?

Say I have an XML file like:

<library>
 <books>
  <![CDATA[<genre><name>Sci-fi</name><count>2</count></genre>]]>
  <book>
   <name>
    Some Book
   </name>
   <author>
    Some author
   </author>
  </book>
  <book>
   <name>
    Another Book
   </name>
   <author>
    Another author
   </author>
  </book>
 </books>
</library>

I want to read the 'name' element inside the CDATA section in an XSLT transformation and place its value in the value of a tag. How do I do this? AFAIK, we cannot use XPath on the contents of CDATA. Is there some hack or workaround for this? I want to do this strictly in XSLT.

r3st0r3 asked Apr 25 '12 20:04


2 Answers

Some XSLT processors provide an extension function, for example saxon:parse(), that allows you to take a string containing lexical XML and convert it into a tree of nodes.

Michael Kay answered Sep 28 '22 22:09


You could also select the CDATA section and then pass the result to a second XSL stylesheet.

For instance if you get the CDATA section out like this:

<xsl:template match="//books/text()">
  <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>

You would end up with a result like:

<genre><name>Sci-fi</name><count>2</count></genre>

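which you could then apply another XSL stylesheet to, or query with XPath if you are working with just a DOM. Note that disable-output-escaping only takes effect when the result is serialized, which is why the embedded markup has to be re-parsed in a second pass. This assumes that your CDATA always contains well-formed XML. Otherwise, the RegEx answer by Martin is the way.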
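If the second pass happens outside XSLT, the same idea can be sketched in a few lines of Python using only the standard library. This is a rough illustration rather than a definitive implementation: the `genre_name` helper and the shortened `SOURCE` document are made up for the example, and it assumes the CDATA section is the first thing inside `<books>` and contains well-formed XML.

```python
import xml.etree.ElementTree as ET

# Shortened version of the document from the question (assumption: the
# CDATA section is the first content inside <books>).
SOURCE = """<library>
 <books>
  <![CDATA[<genre><name>Sci-fi</name><count>2</count></genre>]]>
  <book>
   <name>Some Book</name>
   <author>Some author</author>
  </book>
 </books>
</library>"""


def genre_name(xml_text: str) -> str:
    """Hypothetical helper: re-parse the CDATA text and return <name>."""
    outer = ET.fromstring(xml_text)  # root element is <library>
    books = outer.find("books")
    # First pass: the parser hands over the CDATA content as plain character
    # data, so it shows up (with surrounding whitespace) as the text that
    # precedes the first child element of <books>.
    embedded = (books.text or "").strip()
    # Second pass: parse that string as its own XML document.
    inner = ET.fromstring(embedded)  # root element is <genre>
    return inner.findtext("name")


print(genre_name(SOURCE))  # prints: Sci-fi
```

If the embedded text is not well-formed, the second `ET.fromstring` call raises `xml.etree.ElementTree.ParseError`, which matches the caveat above.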

Jayson Lorenzen answered Sep 29 '22 00:09