
Insert node into another XML, create new elements (or update existing ones) and reorder resulting document

This is my first question, although I use the site assiduously. I've been trying to find a solution to this for the last two days without success. Using answers to several similar questions on this site, I've been able to make some progress, but the complete (and correct!) solution still escapes me.

I have an existing XML file (file1.xml) that I have to update based on another one I'm generating (file2.xml). The content of file2 has to be included in file1 according to some rules I'll state later (content of files has been oversimplified to show only relevant elements):

file1.xml

<?xml version="1.0" encoding="UTF-8"?>
<list>
    <decade lastyear="2012" firstyear="2011">
        <year value="2012">
            <issue year="2012"  number="242" />
            <issue year="2012"  number="241" />
            <issue year="2012"  number="240" />
        </year>
        <year value="2011">
            <issue year="2011"  number="238" />
            <issue year="2011"  number="237" />
            <issue year="2011"  number="236" />
            <issue year="2011"  number="235" />
        </year>
    </decade>
    <decade lastyear="2010" firstyear="2001">
        <year value="2010">
            <issue year="2010"  number="234" />
            <issue year="2010"  number="233" />
            <issue year="2010"  number="232" />
            <issue year="2010"  number="231" />
            <issue year="2010"  number="230" />
        </year>
        <year value="2009">
            <issue year="2009"  number="229" />
            <issue year="2009"  number="228" />
            <issue year="2009"  number="227" />
            <issue year="2009"  number="226" />
            <issue year="2009"  number="225" />
        </year>
           ...
    </decade>
 </list>

file2.xml

<?xml version="1.0" encoding="UTF-8"?>
<issue year="2013" number="245" />
...

As said before, the content of file2 must be inserted into file1 according to these rules:

  • If the issue's year doesn't exist in file1 (i.e., when inserting the first issue of the year), it must be created (already done).
  • The new issue must be inserted under the corresponding year (already done).
  • The decade must be updated to reflect the last inserted year (I'm having problems with this one!).
  • The issue elements must be sorted in descending order by year and number.
  • If the issue's year belongs to a new decade, that decade has to be created along with the corresponding child year and issue(s).
  • In the resulting document, all elements must be sorted in descending order: decade (lastyear), year (value) and issue (year and number).

I'm using Saxon-HE 9.4.0.6, and the XSL I've written so far is this:

XSL

<?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs"
    version="2.0">
    <xsl:output method="xml" omit-xml-declaration="yes" indent="no" encoding="UTF-8"/>

    <xsl:variable name="up" select="document('../test/ExcelStory/file2.xml')"/>
    <xsl:variable name="year" select="$up/issue/@year" />

    <xsl:template match="@* | node()" >
       <xsl:copy>
           <xsl:apply-templates select="@*|node()">
               <xsl:sort select="//issue/@year" />
            </xsl:apply-templates>
       </xsl:copy>
    </xsl:template>

    <xsl:template match="decade" >
        <xsl:copy>
            <xsl:apply-templates select="* | @*"/>
            <xsl:choose>
                <xsl:when test="year[1]/@value lt $year">
                    <year value="{$year}"/>
                </xsl:when>
            </xsl:choose>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="year[@value=$year]">
        <xsl:copy>
            <xsl:apply-templates select="* | @*"/>
            <xsl:apply-templates select="$up/*" />
        </xsl:copy>
    </xsl:template>    
</xsl:stylesheet>

This stylesheet assumes the content of file1.xml is already sorted when read (which is the case).

I'm wondering whether I need more than one pass using 'mode': first create the decade according to the year (if necessary), then insert the year into the correct decade (second pass?), then insert the issues into the correct year (third pass?), and finally reorder all the elements (yet another pass?). Or can all the required processing be done more efficiently, in one or two passes? Mr. Michael Kay suggested elsewhere using xsl:for-each for this kind of processing, but I don't know whether it would fit better (or be easier) in this case.

Even if this question may seem similar to some others on Stack Overflow, I think there is some added complexity that makes it worth reading (and maybe answering, I hope!).

I'd be grateful for any ideas about how to proceed, or for pointers to additional resources.

asked Jan 31 '14 18:01 by oswcab


1 Answer

Instead of trying to add the new issue(s) to the existing structure, I would combine all of the issues from both files and then recreate the structure from scratch.

This might not work for your actual use case because you said:

(content of files has been oversimplified to show only relevant elements)

but hopefully it gives you another perspective and/or starting point.

You will probably want to add an identity transform and replace xsl:copy-of and xsl:perform-sort with xsl:apply-templates. You will also need to update xsl:param to point to an external file.

XML Input (modified slightly to add more years and change numbering for testing)

<list>
    <decade lastyear="2012" firstyear="2011">
        <year value="2012">
            <issue year="2012"  number="242" />
            <issue year="2012"  number="241" />
            <issue year="2012"  number="240" />
        </year>
        <year value="2011">
            <issue year="2011"  number="238" />
            <issue year="2011"  number="237" />
            <issue year="2011"  number="236" />
            <issue year="2011"  number="235" />
        </year>
    </decade>
    <decade lastyear="2010" firstyear="2001">
        <year value="2010">
            <issue year="2010"  number="234" />
            <issue year="2010"  number="232" />
            <issue year="2010"  number="233" />
            <issue year="2010"  number="231" />
            <issue year="2010"  number="230" />
        </year>
        <year value="2009">
            <issue year="2009"  number="229" />
            <issue year="2009"  number="228" />
            <issue year="2009"  number="227" />
            <issue year="2009"  number="226" />
            <issue year="2009"  number="225" />
        </year>
        <year value="2001">
            <issue year="2001"  number="123" />
        </year>
    </decade>
</list>

XSLT 2.0

<xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" 
    exclude-result-prefixes="xs">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!--This can be changed to point to an external XML file.-->
    <xsl:param name="up">
        <issue year="2013" number="245" />
        <issue year="2002" number="135" />
        <issue year="2011" number="239" />
    </xsl:param>

    <xsl:template match="/*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:for-each-group select="($up/issue|*/*/issue)" group-by="floor((number(@year) - 1) div 10)">
                <xsl:sort select="@year" data-type="number" order="descending"/>
                <decade lastyear="{max(current-group()/@year)}" firstyear="{min(current-group()/@year)}">
                    <xsl:for-each-group select="current-group()" group-by="@year">
                        <xsl:sort select="current-grouping-key()" data-type="number" order="descending"/>                   
                        <year value="{current-grouping-key()}">
                            <xsl:perform-sort select="current-group()">
                                <xsl:sort select="@number" data-type="number" order="descending"/>
                            </xsl:perform-sort>
                        </year>
                    </xsl:for-each-group>
                </decade>
            </xsl:for-each-group>           
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
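
The decade grouping key used above, floor((year - 1) div 10), is what makes a decade run from a year ending in 1 to a year ending in 0. As a small illustration only (the template name main and the sample years are chosen for this sketch and are not part of the stylesheet above), the following evaluates the same expression for a few years. It can be run without a source document by naming main as the initial template (for example with Saxon's -it option).

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Hypothetical entry point: prints each sample year with its decade key. -->
    <xsl:template name="main">
        <xsl:for-each select="(2001, 2010, 2011, 2013)">
            <!-- Same arithmetic as the group-by expression in the stylesheet above. -->
            <xsl:value-of select="concat(., ' -> ', floor((. - 1) div 10), '&#10;')"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Here 2001 and 2010 share the key 200, while 2011 and 2013 share the key 201, which is why issue 245 from 2013 ends up in the same decade element as the 2011 and 2012 issues.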

XML Output

<list>
   <decade lastyear="2013" firstyear="2011">
      <year value="2013">
         <issue year="2013" number="245"/>
      </year>
      <year value="2012">
         <issue year="2012" number="242"/>
         <issue year="2012" number="241"/>
         <issue year="2012" number="240"/>
      </year>
      <year value="2011">
         <issue year="2011" number="239"/>
         <issue year="2011" number="238"/>
         <issue year="2011" number="237"/>
         <issue year="2011" number="236"/>
         <issue year="2011" number="235"/>
      </year>
   </decade>
   <decade lastyear="2010" firstyear="2001">
      <year value="2010">
         <issue year="2010" number="234"/>
         <issue year="2010" number="233"/>
         <issue year="2010" number="232"/>
         <issue year="2010" number="231"/>
         <issue year="2010" number="230"/>
      </year>
      <year value="2009">
         <issue year="2009" number="229"/>
         <issue year="2009" number="228"/>
         <issue year="2009" number="227"/>
         <issue year="2009" number="226"/>
         <issue year="2009" number="225"/>
      </year>
      <year value="2002">
         <issue year="2002" number="135"/>
      </year>
      <year value="2001">
         <issue year="2001" number="123"/>
      </year>
   </decade>
</list>
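
The adaptations suggested at the start of this answer (an identity transform, xsl:apply-templates in place of xsl:copy-of and xsl:perform-sort, and a parameter that reads an external file) could look roughly like the sketch below. It rests on a few assumptions that are not confirmed by the question: file2.xml is reachable at the relative path shown (document() resolves it against the stylesheet's location), and if it holds several issues they sit under some wrapper element, which is why $up//issue is used. The root template matches /list rather than /* so that an issue which is the document element of file2.xml is handled by the identity template instead of re-entering the root template.

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Assumption: adjust this path to wherever file2.xml actually lives. -->
    <xsl:param name="up" select="document('file2.xml')"/>

    <!-- Identity transform: copies anything no other template handles,
         including each issue element and any extra content it may carry. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Rebuild decade/year structure from the issues of both documents. -->
    <xsl:template match="/list">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:for-each-group select="($up//issue | */*/issue)" group-by="floor((number(@year) - 1) div 10)">
                <xsl:sort select="@year" data-type="number" order="descending"/>
                <decade lastyear="{max(current-group()/@year)}" firstyear="{min(current-group()/@year)}">
                    <xsl:for-each-group select="current-group()" group-by="@year">
                        <xsl:sort select="current-grouping-key()" data-type="number" order="descending"/>
                        <year value="{current-grouping-key()}">
                            <!-- Sorted apply-templates replaces xsl:perform-sort. -->
                            <xsl:apply-templates select="current-group()">
                                <xsl:sort select="@number" data-type="number" order="descending"/>
                            </xsl:apply-templates>
                        </year>
                    </xsl:for-each-group>
                </decade>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

Because issues now go through xsl:apply-templates, more specific templates can be added later for issue or its children without touching the grouping logic. Note that this sketch does not remove duplicates: an issue present in both files would appear twice in the output.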
answered Nov 15 '22 12:11 by Daniel Haley