Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I prevent duplicates, in XSL?

How do I prevent duplicate entries into a list, and then ideally, sort that list? What I'm doing, is when information at one level is missing, taking the information from a level below it, to building the missing list, in the level above. Currently, I have XML similar to this:

<c03 id="ref6488" level="file">
    <did>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

I then apply the following XSL:

<xsl:template match="c03/did">
    <xsl:choose>
        <xsl:when test="not(container)">
            <did>
                <!-- If no c03 container item is found, look in the c04 level for one -->
                <xsl:if test="../c04/did/container">

                    <!-- If a c04 container item is found, use the info to build a c03 version -->
                    <!-- Skip c03 container item, if still no c04 items found -->
                    <container label="Box" type="Box">

                        <!-- Build container list -->
                        <!-- Test for more than one item, and if so, list them, -->
                        <!-- separated by commas and a space -->
                        <xsl:for-each select="../c04/did">
                            <xsl:if test="position() &gt; 1">, </xsl:if>
                            <xsl:value-of select="container"/>
                        </xsl:for-each>
                    </container>                    
            </did>
        </xsl:when>

        <!-- If there is a c03 container item(s), list it normally -->
        <xsl:otherwise>
            <xsl:copy-of select="."/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

But I'm getting the "container" result of

<container label="Box" type="Box">156, 156, 154</container>

when what I want is

<container label="Box" type="Box">154, 156</container>

Below is the full result that I'm trying to get:

<c03 id="ref6488" level="file">
    <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
    </did>
    <c04 id="ref34582" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <container label="Folder" type="Folder">3</container>
        </did>
    </c04>
    <c04 id="ref6540" level="file">
        <did>
            <container label="Box" type="Box">156</container>
            <unittitle>Contact prints</unittitle>
        </did>
    </c04>
    <c04 id="ref6606" level="file">
        <did>
            <container label="Box" type="Box">154</container>
            <unittitle>Negatives</unittitle>
        </did>
    </c04>
</c03>

Thanks in advance for any help!

like image 640
LOlliffe Avatar asked Apr 19 '10 18:04

LOlliffe


People also ask

What is Cdata in xsl?

The term CDATA, meaning character data, is used for distinct, but related, purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.

How do I find duplicate values in XSLT?

1 Answer. Show activity on this post. This is very straight-forward in XSLT 2.0, using xsl:for-each-group, although you only need to select the first element in each group. Do note the use of the identity template, to copy existing elements.

What is xsl Strip space?

Used at the top level of the stylesheet to define elements in the source document for which whitespace nodes are insignificant and should be removed from the tree before processing.


1 Answers

There is no need for an XSLT 2.0 solution for this problem.

Here is an XSLT 1.0 solution, which is more compact than the currently selected XSLT 2.0 solution (35 lines vs. 43 lines):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="kBoxContainerByVal"
     match="container[@type='Box']" use="."/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="c03/did[not(container)]">
   <xsl:copy>

   <xsl:variable name="vContDistinctValues" select=
    "/*/*/*/container[@type='Box']
            [generate-id()
            =
             generate-id(key('kBoxContainerByVal', .)[1])
            ]
            "/>

    <container label="Box" type="Box">
      <xsl:for-each select="$vContDistinctValues">
        <xsl:sort data-type="number"/>

        <xsl:value-of select=
        "concat(., substring(', ', 1 + 2*(position() = last())))"/>
      </xsl:for-each>
    </container>
    <xsl:apply-templates/>
   </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the originally provided XML document, the correct, wanted result is produced:

<c03 id="ref6488" level="file">
   <did>
      <container label="Box" type="Box">156, 154</container>
      <unittitle>Clinic Building</unittitle>
      <unitdate era="ce" calendar="gregorian">1947</unitdate>
   </did>
   <c04 id="ref34582" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <container label="Folder" type="Folder">3</container>
      </did>
   </c04>
   <c04 id="ref6540" level="file">
      <did>
         <container label="Box" type="Box">156</container>
         <unittitle>Contact prints</unittitle>
      </did>
   </c04>
   <c04 id="ref6606" level="file">
      <did>
         <container label="Box" type="Box">154</container>
         <unittitle>Negatives</unittitle>
      </did>
   </c04>
</c03>

Update:

I didn't notice the requirement that the container numbers must appear sorted. Now the solution reflects this.

like image 145
Dimitre Novatchev Avatar answered Oct 09 '22 18:10

Dimitre Novatchev