Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Truncate XML with XSLT

I have a question for the clever people of the SO community.

Below is a snippet of XML generated by the Symphony CMS.

   <news>
        <entry>
            <title>Lorem Ipsum</title>
            <body>
                <p><strong>Lorem Ipsum</strong></p>
                <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed malesuada auctor magna. Vivamus urna justo, pulvinar nec, sagittis malesuada, accumsan in, massa. Quisque mi purus, gravida eget, ultricies a, porta in, sem. Maecenas justo elit, elementum vel, feugiat vulputate, pulvinar nec, velit. Fusce vel ante et diam bibendum euismod. Nunc vel nulla non lorem dignissim placerat. Nulla magna massa, auctor et, tempor nec, auctor sit amet, turpis. Quisque odio lacus, auctor at, posuere id, suscipit eget, dui. Phasellus aliquam. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Proin varius. Phasellus cursus. Cras mattis adipiscing turpis. Sed.</p>
                <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed malesuada auctor magna.</p>
            </body>
        </entry>
    </news>

What I need to do is take a portion of the <body> element, based on a specified length, for display in the blog style of:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed malesuada auctor magna. Vivamus urna justo, pulvinar nec, sagittis malesuada, accumsan in, massa. Quisque mi purus, gravida eget, ultricies a, porta in, sem... more

...where more is a link to the full news item. I know I can select specific paragraphs and I also know I can use the substring function to bring a specified number of characters. However, I need to preserve the formatting of the text, i.e. the HTML tags within the <body> element.

I realise this raises issues of tag closure but there must surely be a way. Hopefully someone more experienced with XSLT can shed some light on this issue.

like image 640
Neil Albrock Avatar asked Feb 10 '09 12:02

Neil Albrock


1 Answers

Here's my version. I've tested it over your XML sample and it works.

To invoke it, use <xsl:apply-templates select="path/to/body/*" mode="truncate"/>.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:strip-space elements="*"/>

<!-- limit: the truncation limit -->
<xsl:variable name="limit" select="250"/>

<!-- t: Total number of characters in the set -->
<xsl:variable name="t" select="string-length(normalize-space(//body))"/>

<xsl:template match="*" mode="truncate">
    <xsl:variable name="preceding-strings">
        <xsl:copy-of select="preceding::text()[ancestor::body]"/>
    </xsl:variable>

    <!-- p: number of characters up to the current node -->
    <xsl:variable name="p" select="string-length(normalize-space($preceding-strings))"/>

    <xsl:if test="$p &lt; $limit">
        <xsl:element name="{name()}">
            <xsl:apply-templates select="@*" mode="truncate"/>
            <xsl:apply-templates mode="truncate"/>
        </xsl:element>
    </xsl:if>
</xsl:template>

<xsl:template match="text()" mode="truncate">
    <xsl:variable name="preceding-strings">
        <xsl:copy-of select="preceding::text()[ancestor::body]"/>
    </xsl:variable>

    <!-- p: number of characters up to the current node -->
    <xsl:variable name="p" select="string-length(normalize-space($preceding-strings))"/>

    <!-- c: number of characters including current node -->
    <xsl:variable name="c" select="$p + string-length(.)"/>

    <xsl:choose>
        <xsl:when test="$limit &lt;= $c">
            <xsl:value-of select="substring(., 1, ($limit - $p))"/>
            <xsl:text>&#8230;</xsl:text>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

<xsl:template match="@*" mode="truncate">
    <xsl:attribute name="{name(.)}"><xsl:value-of select="."/></xsl:attribute>
</xsl:template>

</xsl:stylesheet>
like image 133
Chaotic Pattern Avatar answered Sep 21 '22 00:09

Chaotic Pattern