I have an XML file in which everything is well structured except for ordered lists. Every list item is tagged as a paragraph <P>, with the enumeration added manually, e.g. (1). I want to create a valid HTML list from that source.
Using xsl:analyze-string with xsl:matching-substring and a regular expression, I was able to extract every list item, but I can't find a way to add the surrounding <ol> tags.
Here is an example:
XML source:
<Content>
<P>(1) blah</P>
<P>(2) blah</P>
<P>(2) blah</P>
</Content>
What I have so far:
<xsl:variable name="text" select="/Content/*/text()"/>
<xsl:analyze-string select="$text" regex="(\(\d+\))([^(]*)">
<xsl:matching-substring>
<![CDATA[<li>]]><xsl:value-of select="regex-group(2)"/><![CDATA[</li>]]>
</xsl:matching-substring>
</xsl:analyze-string>
Output:
<li>blah</li>
<li>blah</li>
<li>blah</li>
In case you are wondering: output has to be plain text in general, only the contents of the $text variable have to be output in HTML. That's why I am using <![CDATA[]].
As simple as this:
I. XSLT 2.0 solution:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/*">
    <ol>
      <xsl:apply-templates/>
    </ol>
  </xsl:template>

  <xsl:template match="P[matches(., '(^\(\d+\)\s*)(.*)')]">
    <li>
      <xsl:analyze-string select="." regex="(^\(\d+\)\s*)(.*)">
        <xsl:matching-substring>
          <xsl:value-of select="regex-group(2)"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </li>
  </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<Content>
<P>(1) blah</P>
<P>(2) blah</P>
<P>(2) blah</P>
</Content>
the wanted, correct result is produced:
<ol>
<li>blah</li>
<li>blah</li>
<li>blah</li>
</ol>
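The stylesheet above wraps every child of the top element in a single <ol>. If the real document also contains ordinary paragraphs between the numbered ones (an assumption; the sample in the question does not show this), one possible variation is to group adjacent numbered paragraphs with xsl:for-each-group. The following is only a sketch under that assumption: it uses the XPath 2.0 functions matches() and replace(), and copying non-list elements through unchanged is an illustrative choice, not something the question asks for.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <!-- Group adjacent children by whether their text starts with "(number)" -->
    <xsl:for-each-group select="*" group-adjacent="matches(., '^\(\d+\)')">
      <xsl:choose>
        <!-- A run of numbered paragraphs becomes one ordered list -->
        <xsl:when test="current-grouping-key()">
          <ol>
            <xsl:for-each select="current-group()">
              <!-- Strip the manual "(n) " prefix; replace() avoids xsl:analyze-string -->
              <li><xsl:value-of select="replace(., '^\(\d+\)\s*', '')"/></li>
            </xsl:for-each>
          </ol>
        </xsl:when>
        <!-- Assumption: anything else is copied through unchanged -->
        <xsl:otherwise>
          <xsl:copy-of select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document, where all three paragraphs are numbered, this should produce a single <ol> with three <li> items, just like the simpler stylesheet.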
II. XSLT 1.0 solution:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/*">
    <ol>
      <xsl:apply-templates/>
    </ol>
  </xsl:template>

  <xsl:template match=
    "P[starts-with(.,'(')
     and
       floor(substring-before(substring(.,2), ')'))
      =
       substring-before(substring(.,2), ')')
      ]">
    <li>
      <xsl:value-of select="substring-after(., ') ')"/>
    </li>
  </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the same XML document (above), the same correct result is produced:
<ol>
<li>blah</li>
<li>blah</li>
<li>blah</li>
</ol>