I am trying to develop a custom XSLT extension function that returns a node set or an XML fragment. For example:
Input document:
<root>
<!--
author: blablabla
usage: more blablabla
labelC: [in=2] <b>formatted</b> blablabla
-->
<tag1 name="first">
<tag2>content a</tag2>
<tag2>content b</tag2>
<tag3 attrib="val">content c</tag3>
</tag1>
<!--
author: blebleble
usage: more blebleble
labelC: blebleble
-->
<tag1 name="second">
<tag2>content x</tag2>
<tag2>content y</tag2>
<tag3 attrib="val">content z</tag3>
</tag1>
</root>
So that an XSLT template such as:
<xsl:template match="//tag1/preceding::comment()[1]" xmlns:d="java:com.dummy.func">
<section>
<para>
<xsl:value-of select="d:genDoc(.)"/>
</para>
</section>
</xsl:template>
Would produce:
<section>
<para>
<author>blablabla</author>
<usage>more blablabla</usage>
<labelC in="2"><b>formatted</b> blablabla</labelC>
</para>
</section>
When matched on the first occurrence of tag1 and
<section>
<para>
<author>blebleble</author>
<usage>more blebleble</usage>
<labelC>blebleble</labelC>
</para>
</section>
When matched on the second occurrence.
Basically what I want to achieve with this custom function is to parse some meta-data present in the comments and use it to generate XML.
I found some examples online, one at: http://cafeconleche.org/books/xmljava/chapters/ch17s03.html
According to that example, my function should return one of the following types:
org.w3c.dom.traversal.NodeIterator,
org.apache.xml.dtm.DTM,
org.apache.xml.dtm.DTMAxisIterator,
org.apache.xml.dtm.DTMIterator,
org.w3c.dom.Node and its subtypes (Element, Attr, etc),
org.w3c.dom.DocumentFragment
I was able to implement a function that returns the XML as a plain String. This, however, causes several other problems, the main one being that the markup characters get escaped when the string is inserted into the output XML.
Does anybody have an example of how to implement such a function? I am mostly interested in how to return a proper XML node set to the calling template.
The stylesheet below may get you a long way toward what you want. Note that it requires XSLT 2.0 (it is possible in XSLT 1.0 too, if you supply a replacement for the tokenize() function). Also note that it assumes a specific structure for the comment contents.
Explanation: each comment is first split into rows (delimiter &#xA;, which is a line feed), then each row into tag and value (delimiter ":", giving author, usage and labelC; the order is not important here), and finally, for labelC, into attributes and value (delimiter "] ", with attributes recognized by a leading "[").
Note that a lot of whitespace is stripped using normalize-space().
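Since the question asks about a Java extension function, the same splitting steps can also be sketched in plain Java. This is only an illustration under assumptions: the class name CommentFields and its Field type are hypothetical and do not come from the question or from any library, and unlike the stylesheet it splits each row at the first ":" only, so a value may itself contain a colon.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original answer.
final class CommentFields {

    /** One parsed "label: [attr=val] text" row. */
    static final class Field {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        String value = "";

        Field(String name) {
            this.name = name;
        }
    }

    /** Splits a comment into rows, then label/value, then leading [attr=val] groups. */
    static List<Field> parse(String comment) {
        List<Field> fields = new ArrayList<>();
        for (String row : comment.split("\\r?\\n")) {
            // Rough equivalent of normalize-space(): trim and collapse whitespace runs.
            String line = row.trim().replaceAll("\\s+", " ");
            int colon = line.indexOf(':');
            if (colon < 1) {
                continue; // blank row, or no "label:" prefix
            }
            Field field = new Field(line.substring(0, colon).trim());
            String rest = line.substring(colon + 1).trim();
            // Leading "[name=value]" groups become attributes.
            while (rest.startsWith("[")) {
                int close = rest.indexOf(']');
                if (close < 0) {
                    break; // unbalanced bracket: keep the remainder as text
                }
                String pair = rest.substring(1, close);
                int eq = pair.indexOf('=');
                if (eq < 1) {
                    break; // not a name=value pair: keep the remainder as text
                }
                field.attributes.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
                rest = rest.substring(close + 1).trim();
            }
            field.value = rest;
            fields.add(field);
        }
        return fields;
    }

    public static void main(String[] args) {
        String comment = "\nauthor: blablabla\nusage: more blablabla\n"
                + "labelC: [in=2] <b>formatted</b> blablabla\n";
        for (Field f : parse(comment)) {
            System.out.println(f.name + " " + f.attributes + " -> " + f.value);
        }
    }
}
```

At this stage the value is still a plain string; turning markup such as <b>formatted</b> into real nodes is a separate step.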
Edit: a version that uses an xsl:function is at the bottom.
XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:apply-templates/>
</output>
</xsl:template>
<xsl:template match="tag1/*">
</xsl:template>
<xsl:template match="comment()">
<section>
<para>
<xsl:for-each select="tokenize(., '&#xA;')[string-length() != 0]">
<xsl:variable name="splitup" select="tokenize(normalize-space(current()), ':')"/>
<xsl:choose>
<xsl:when test="$splitup[1]='author'">
<author><xsl:value-of select="normalize-space($splitup[2])"/></author>
</xsl:when>
<xsl:when test="$splitup[1]='usage'">
<usage><xsl:value-of select="normalize-space($splitup[2])"/></usage>
</xsl:when>
<xsl:when test="$splitup[1]='labelC'">
<labelC>
<xsl:for-each select="tokenize($splitup[2], '] ')[string-length() != 0]">
<xsl:variable name="labelCpart" select="normalize-space(current())"/>
<xsl:choose>
<xsl:when test="substring($labelCpart, 1,1) = '['">
<xsl:variable name="attr" select="tokenize(substring($labelCpart, 2), '=')"/>
<xsl:attribute name="{$attr[1]}"><xsl:value-of select="$attr[2]"/></xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$labelCpart"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</labelC>
</xsl:when>
</xsl:choose>
</xsl:for-each>
</para>
</section>
</xsl:template>
</xsl:stylesheet>
when applied to the following XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<!--
author: blablabla
usage: more blablabla
labelC: [in=2] <b>formatted</b> blablabla
-->
<tag1 name="first">
<tag2>content a</tag2>
<tag2>content b</tag2>
<tag3 attrib="val">content c</tag3>
</tag1>
<!--
author: blebleble
usage: more blebleble
labelC: blebleble
-->
<tag1 name="second">
<tag2>content x</tag2>
<tag2>content y</tag2>
<tag3 attrib="val">content z</tag3>
</tag1>
</root>
gives the following output
<?xml version="1.0" encoding="UTF-8"?>
<output>
<section>
<para>
<author>blablabla</author>
<usage>more blablabla</usage>
<labelC in="2"><b>formatted</b> blablabla</labelC>
</para>
</section>
<section>
<para>
<author>blebleble</author>
<usage>more blebleble</usage>
<labelC>blebleble</labelC>
</para>
</section>
</output>
Edited XSLT with a function call (gives the same output):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:d="java:com.dummy.func"
exclude-result-prefixes="d">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:apply-templates/>
</output>
</xsl:template>
<xsl:template match="tag1/*">
</xsl:template>
<xsl:function name="d:section">
<xsl:param name="comm"/>
<section>
<para>
<xsl:for-each select="tokenize($comm, '&#xA;')[string-length() != 0]">
<xsl:variable name="splitup" select="tokenize(normalize-space(current()), ':')"/>
<xsl:choose>
<xsl:when test="$splitup[1]='author'">
<author><xsl:value-of select="normalize-space($splitup[2])"/></author>
</xsl:when>
<xsl:when test="$splitup[1]='usage'">
<usage><xsl:value-of select="normalize-space($splitup[2])"/></usage>
</xsl:when>
<xsl:when test="$splitup[1]='labelC'">
<labelC>
<xsl:for-each select="tokenize($splitup[2], '] ')[string-length() != 0]">
<xsl:variable name="labelCpart" select="normalize-space(current())"/>
<xsl:choose>
<xsl:when test="substring($labelCpart, 1,1) = '['">
<xsl:variable name="attr" select="tokenize(substring($labelCpart, 2), '=')"/>
<xsl:attribute name="{$attr[1]}"><xsl:value-of select="$attr[2]"/></xsl:attribute>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$labelCpart"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</labelC>
</xsl:when>
</xsl:choose>
</xsl:for-each>
</para>
</section>
</xsl:function>
<xsl:template match="comment()">
<xsl:copy-of select="d:section(.)"/>
</xsl:template>
</xsl:stylesheet>
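If a Java extension function is still preferred over xsl:function, a minimal sketch that returns an org.w3c.dom.DocumentFragment (one of the return types listed in the question) could look like the following. It relies only on the standard JAXP and DOM APIs. The class name GenDoc is hypothetical, and the sketch assumes that every label is a valid XML element name and that the processor hands the comment text to the method as a String. Values that are not well-formed XML fall back to plain text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical extension class; the name GenDoc is illustrative only.
final class GenDoc {

    /** Turns "label: [attr=val] text" rows into a fragment of DOM elements. */
    public static DocumentFragment genDoc(String comment) {
        DocumentBuilder builder;
        try {
            builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Document doc = builder.newDocument();
        DocumentFragment fragment = doc.createDocumentFragment();
        for (String row : comment.split("\\r?\\n")) {
            String line = row.trim();
            int colon = line.indexOf(':');
            if (colon < 1) {
                continue; // blank row, or no "label:" prefix
            }
            // Assumption: the label is a valid XML element name.
            Element element = doc.createElement(line.substring(0, colon).trim());
            String value = line.substring(colon + 1).trim();
            // Leading "[name=value]" groups become attributes.
            while (value.startsWith("[")) {
                int close = value.indexOf(']');
                int eq = value.indexOf('=');
                if (close < 0 || eq < 2 || eq > close) {
                    break; // not a well-formed group: keep it as text
                }
                element.setAttribute(value.substring(1, eq).trim(),
                        value.substring(eq + 1, close).trim());
                value = value.substring(close + 1).trim();
            }
            appendMarkup(builder, doc, element, value);
            fragment.appendChild(element);
        }
        return fragment;
    }

    /** Parses the value as XML content so inline tags become real nodes. */
    private static void appendMarkup(DocumentBuilder builder, Document doc,
                                     Element target, String value) {
        try {
            // Wrap in a throwaway root so mixed content parses as one document.
            Document parsed = builder.parse(
                    new InputSource(new StringReader("<x>" + value + "</x>")));
            for (Node n = parsed.getDocumentElement().getFirstChild();
                    n != null; n = n.getNextSibling()) {
                target.appendChild(doc.importNode(n, true));
            }
        } catch (SAXException | IOException e) {
            // Not well-formed: fall back to plain (escaped) text.
            target.setTextContent(value);
        }
    }

    public static void main(String[] args) {
        DocumentFragment f = genDoc(
                "author: blablabla\nlabelC: [in=2] <b>formatted</b> blablabla");
        for (Node n = f.getFirstChild(); n != null; n = n.getNextSibling()) {
            System.out.println(n.getNodeName() + ": " + n.getTextContent());
        }
    }
}
```

In the calling template the result would be inserted with xsl:copy-of rather than xsl:value-of, because xsl:value-of outputs only the string value of the returned nodes. How the extension namespace is bound to the class, and how a comment node is converted to the String parameter, depends on the XSLT processor; passing string(.) makes the conversion explicit.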