I cannot have any nested spans, so I need to flatten them and concatenate their class attributes to keep track of which classes came from parent spans.
Here's a simplified input:
<body>
<h1 class="section">Title</h1>
<p class="main">
ZZZ
<span class="a">
AAA
<span class="b">
BBB
<span class="c">
CCC
<preserveMe>
eeee
</preserveMe>
</span>
bbb
<preserveMe>
eeee
</preserveMe>
</span>
aaa
</span>
</p>
</body>
Here's the desired output:
<body>
<h1 class="section">Title</h1>
<p class="main">
ZZZ
<span class="a">
AAA
</span>
<span class="ab">
BBB
</span>
<span class="abc">
CCC
<preserveMe>
eeee
</preserveMe>
</span>
<span class="ab">
bbb
<preserveMe>
eeee
</preserveMe>
</span>
<span class="a">
aaa
</span>
</p>
</body>
Here's the closest I've come (I'm really new to this, so even getting this far took me a long time):
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="/">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>
  <xsl:template match="*/span">
    <span class='{concat(../../@class,../@class,@class)}'>
      <xsl:value-of select='.'/>
    </span>
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
You can see the result of my failed attempt and how far it is from what I really wanted if you run it yourself. Ideally, I'd like a solution that accepts an arbitrary number of nested levels and can also handle interrupted nests (span, span, notSpan, span...).
Edit: I have added tags inside the nested structure at the request of commenters below. Also, I'm using XSLT 1.0, but I could use another version if needed.
Edit 2: I realized that my example was oversimplified compared to what I actually need to convert. Namely, I cannot lose classes from other tags; only spans can be combined.
As I mentioned in the opening comments, this is far from being trivial. Here's another approach you may consider:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="p">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()|.//span/text()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="span/text()">
    <span>
      <xsl:attribute name="class">
        <xsl:for-each select="ancestor::span">
          <xsl:value-of select="@class"/>
        </xsl:for-each>
      </xsl:attribute>
      <xsl:apply-templates select="preceding-sibling::*"/>
      <xsl:value-of select="."/>
      <xsl:if test="not(following-sibling::text())">
        <xsl:apply-templates select="following-sibling::*"/>
      </xsl:if>
    </span>
  </xsl:template>
  <xsl:template match="span"/>
</xsl:stylesheet>
This is largely similar to what Lingamurthy CS suggested earlier, but you will see a difference with the following test input:
XML
<body>
<h1 class="section">Title</h1>
<p class="main">
ZZZ
<preserveMe>0</preserveMe>
<span class="a">
AAA
<span class="b">
BBB
<span class="c">
CCC
<preserveMe>c</preserveMe>
</span>
bbb
<preserveMe>b</preserveMe>
</span>
aaa
</span>
<preserveMe>1</preserveMe>
</p>
</body>
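One caveat, as far as I can tell: in the stylesheet above every text node pulls in all of its preceding sibling elements, so a non-span element can be moved or repeated when a span holds more than two text nodes separated by child spans. The following is only a sketch of one possible variation, not a tested replacement. It assumes XSLT 1.0 and uses a key, hypothetically named `run`, to group the non-span children of a span into runs delimited by child spans; each run is then wrapped once. Unlike the stylesheet above, it flattens spans anywhere in the document, not only inside `p`.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- "run" is a hypothetical name. Each non-span child of a span is keyed
       by its parent span plus the nearest preceding sibling span (empty
       string if there is none), so nodes between two child spans share
       one key value. -->
  <xsl:key name="run" match="span/node()[not(self::span)]"
           use="concat(generate-id(..), '|',
                       generate-id(preceding-sibling::span[1]))"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- a span emits nothing itself; its children are visited in order -->
  <xsl:template match="span">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

  <!-- only the first node of each run produces output: one flat span
       holding a copy of the whole run -->
  <xsl:template match="span/node()[not(self::span)]">
    <xsl:variable name="members"
        select="key('run', concat(generate-id(..), '|',
                                  generate-id(preceding-sibling::span[1])))"/>
    <xsl:if test="generate-id() = generate-id($members[1])">
      <span>
        <xsl:attribute name="class">
          <!-- ancestor spans are visited outermost first -->
          <xsl:for-each select="ancestor::span">
            <xsl:value-of select="@class"/>
          </xsl:for-each>
        </xsl:attribute>
        <xsl:copy-of select="$members"/>
      </span>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With the test input above, this should wrap AAA, BBB, CCC, bbb and aaa in spans with classes a, ab, abc, ab and a respectively, keeping each preserveMe next to the text it follows, while the preserveMe elements directly under p are copied unchanged. Like the original, it drops any span attributes other than class.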