
Convert XML into another XML via XSLT

Tags: xml, xslt

I have an XML file like the following, and I want to convert it into another XML file.

<body>
  <outline text="A">
    <outline text="Abelson, Harold" author="Harold Abelson" title="Struktur und Interpretation von Computerprogrammen. Eine Informatik-Einführung" publisher="Springer Verlag" isbn="3540520430" year="1991"/>
    <outline text="Abrahams, Paul W." author="Paul W. Abrahams" title="Tex for the Impatient" publisher="Addison-Wesley Pub Co" isbn="0201513757" year="2000"/>
  </outline>
  <outline text="B">
    <outline text="Bach, Fred" author="Fred Bach" title="UNIX Handbuch zur Programmentwicklung" publisher="Hanser Fachbuchverlag" isbn="3446151036"/>
    <outline text="Bach, Maurice J." author="Maurice J. Bach" title="Design of the UNIX Operating System" publisher="Prentice Hall PTR" isbn="0132017997" year="1986"/>
  </outline>
</body>

Here is the XML format that I want to convert it to:

<list>
    <books text="A">
        <book>
            <text>Abelson, Harold</text>
            <author>Harold Abelson</author>
            <title>Struktur und Interpretation von Computerprogrammen. Eine
                Informatik-Einführung</title>
            <publisher>Springer Verlag</publisher>
            <isbn>3540520430</isbn>
            <year>1991</year>
        </book>
        <book>
            <text>Abrahams, Paul W.</text>
            <author>Paul W. Abrahams</author>
            <title>Tex for the Impatient</title>
            <publisher>Addison-Wesley Pub Co</publisher>
            <isbn>0201513757</isbn>
            <year>2000</year>
        </book>

    </books>
    <books text="B">
        <book>
            <text>Bach, Fred</text>
            <author>Fred Bach</author>
            <title>UNIX Handbuch zur Programmentwicklung</title>
            <publisher>Hanser Fachbuchverlag</publisher>
            <isbn>3446151036</isbn>
            <year />
        </book>

Here is my code

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" method="xml"/>
    <xsl:template match="body">
        <list> <xsl:apply-templates select="outline"/> </list>
    </xsl:template>

    <xsl:template match="outline">
    <books text= "{@text}">
        <book><xsl:apply-templates select="outline"/> 
        <text><xsl:value-of select="@text" /></text>
        <author><xsl:value-of select="@author" /></author>
        <title><xsl:value-of select="@title" /></title>
        <publisher><xsl:value-of select="@publisher" /></publisher>
        <isbn><xsl:value-of select="@isbn" /></isbn>
        <year><xsl:value-of select="@year" /></year>
        </book>
    </books>
    </xsl:template>


</xsl:stylesheet>

And here is the output of my code:

<list>
    <books text="A">
        <book>
            <books text="Abelson, Harold">
                <book>
                    <text>Abelson, Harold</text>
                    <author>Harold Abelson</author>
                    <title>Struktur und Interpretation von Computerprogrammen. Eine
                        Informatik-Einführung</title>
                    <publisher>Springer Verlag</publisher>
                    <isbn>3540520430</isbn>
                    <year>1991</year>
                </book>
            </books>

There are two extra elements in my output:

<books text="Abelson, Harold">
                <book>

As far as I can tell, this may be caused by the following lines of code. I tried a few different approaches, but none of them worked.

<xsl:template match="outline">
        <books text= "{@text}">

Additional question: if the original XML file also contains a head with a title, how can I eliminate the head and title? My current code produces "tmp" in the new XML file.

<opml version="1.0">
  <head>
    <title>tmp</title>
    <expansionState></expansionState>
  </head>
  <body>
      <outline text="A">
      <outline text="Abelson, Harold" author="H
asked Jan 14 '23 10:01 by Michael


2 Answers

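You were on the right track, but you need two outline templates: one for the top-level outlines and one for the child outlines. Your single template matches outline elements at both levels, so each child outline is also wrapped in its own <books> and <book> elements.

Please replace your outline template with these three (the empty head template suppresses the head element and its title, which covers your additional question):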

  <xsl:template match="head" />

  <xsl:template match="outline">
    <books text="{@text}">
      <xsl:apply-templates select="outline" />
    </books>
  </xsl:template>

  <xsl:template match="outline/outline">
    <book>
      <text>
        <xsl:value-of select="@text" />
      </text>
      <author>
        <xsl:value-of select="@author" />
      </author>
      <title>
        <xsl:value-of select="@title" />
      </title>
      <publisher>
        <xsl:value-of select="@publisher" />
      </publisher>
      <isbn>
        <xsl:value-of select="@isbn" />
      </isbn>
      <year>
        <xsl:value-of select="@year" />
      </year>
    </book>
  </xsl:template>
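
For reference, here is a minimal sketch of how these three templates might fit into a complete stylesheet, assuming the input is wrapped in opml/head/body as in the question's follow-up. The body template is the one from the question; the xsl:strip-space line is an addition of this sketch, not part of the answer above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="xml"/>
  <!-- Assumption of this sketch: strip whitespace-only text nodes so the
       built-in rules do not copy stray line breaks from inside opml -->
  <xsl:strip-space elements="*"/>

  <!-- Suppress head and everything inside it, including the title "tmp" -->
  <xsl:template match="head"/>

  <!-- Unchanged from the question: wrap all groups in list -->
  <xsl:template match="body">
    <list>
      <xsl:apply-templates select="outline"/>
    </list>
  </xsl:template>

  <!-- Top-level outline: one books group per letter -->
  <xsl:template match="outline">
    <books text="{@text}">
      <xsl:apply-templates select="outline"/>
    </books>
  </xsl:template>

  <!-- Child outline: one book per entry -->
  <xsl:template match="outline/outline">
    <book>
      <text><xsl:value-of select="@text"/></text>
      <author><xsl:value-of select="@author"/></author>
      <title><xsl:value-of select="@title"/></title>
      <publisher><xsl:value-of select="@publisher"/></publisher>
      <isbn><xsl:value-of select="@isbn"/></isbn>
      <year><xsl:value-of select="@year"/></year>
    </book>
  </xsl:template>
</xsl:stylesheet>
```

A child outline matches both outline patterns, but outline/outline has the higher default priority (0.5 versus 0 for a plain element name), so it wins without an explicit priority attribute. The opml root has no template of its own; the built-in rule simply passes processing down to head and body.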

And if it can safely be assumed that the attribute names in the source document match the element names in the output document, you can replace the outline/outline template with these two shorter, more streamlined ones:

  <xsl:template match="outline/outline">
    <book>
      <xsl:apply-templates select="@text" />
      <xsl:apply-templates select="@author" />
      <xsl:apply-templates select="@title" />
      <xsl:apply-templates select="@publisher" />
      <xsl:apply-templates select="@isbn" />
      <xsl:apply-templates select="@year" />
    </book>
  </xsl:template>

  <xsl:template match="outline/outline/@*">
    <xsl:element name="{name()}">
       <xsl:value-of select="." />
    </xsl:element>
  </xsl:template>
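
One difference worth noting: with the generic attribute template, a book that has no year attribute (such as "Bach, Fred") gets no year element at all, whereas the target output in the question shows an empty <year />. A minimal sketch of one way to keep the empty element, as a drop-in replacement for the first of the two templates above (it relies on the xsl prefix declared in the enclosing stylesheet):

```xml
<xsl:template match="outline/outline">
  <book>
    <!-- These go through the generic attribute template and are
         skipped entirely when the attribute is absent -->
    <xsl:apply-templates select="@text" />
    <xsl:apply-templates select="@author" />
    <xsl:apply-templates select="@title" />
    <xsl:apply-templates select="@publisher" />
    <xsl:apply-templates select="@isbn" />
    <!-- Written out literally so that an empty year element is
         still produced when @year is missing -->
    <year>
      <xsl:value-of select="@year" />
    </year>
  </book>
</xsl:template>
```

Whether an absent year should appear as an empty element or be left out depends on what consumes the result, so treat this as one option rather than a required change.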
answered Jan 17 '23 00:01 by JLRishe


Here is a slightly more push-oriented solution that doesn't rely quite as much on manually pulling elements from the source XML into the result XML. In particular, notice the last template.

When this XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="/*">
    <list>
      <xsl:apply-templates />
    </list>
  </xsl:template>

  <xsl:template match="outline[outline]">
    <books text="{@text}">
      <xsl:apply-templates />
    </books>
  </xsl:template>

  <xsl:template match="outline[not(*)]">
    <book>
      <xsl:apply-templates select="@*" />
    </book>
  </xsl:template>

  <xsl:template match="outline[not(*)]/@*">
    <xsl:element name="{name()}">
      <xsl:value-of select="." />
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>

...is applied against the provided source XML:

<body>
  <outline text="A">
    <outline text="Abelson, Harold" author="Harold Abelson" title="Struktur und Interpretation von Computerprogrammen. Eine Informatik-Einführung" publisher="Springer Verlag" isbn="3540520430" year="1991"/>
    <outline text="Abrahams, Paul W." author="Paul W. Abrahams" title="Tex for the Impatient" publisher="Addison-Wesley Pub Co" isbn="0201513757" year="2000"/>
  </outline>
  <outline text="B">
    <outline text="Bach, Fred" author="Fred Bach" title="UNIX Handbuch zur Programmentwicklung" publisher="Hanser Fachbuchverlag" isbn="3446151036"/>
    <outline text="Bach, Maurice J." author="Maurice J. Bach" title="Design of the UNIX Operating System" publisher="Prentice Hall PTR" isbn="0132017997" year="1986"/>
  </outline>
</body>

...the wanted result is produced:

<list>
  <books text="A">
    <book>
      <text>Abelson, Harold</text>
      <author>Harold Abelson</author>
      <title>Struktur und Interpretation von Computerprogrammen. Eine Informatik-Einführung</title>
      <publisher>Springer Verlag</publisher>
      <isbn>3540520430</isbn>
      <year>1991</year>
    </book>
    <book>
      <text>Abrahams, Paul W.</text>
      <author>Paul W. Abrahams</author>
      <title>Tex for the Impatient</title>
      <publisher>Addison-Wesley Pub Co</publisher>
      <isbn>0201513757</isbn>
      <year>2000</year>
    </book>
  </books>
  <books text="B">
    <book>
      <text>Bach, Fred</text>
      <author>Fred Bach</author>
      <title>UNIX Handbuch zur Programmentwicklung</title>
      <publisher>Hanser Fachbuchverlag</publisher>
      <isbn>3446151036</isbn>
    </book>
    <book>
      <text>Bach, Maurice J.</text>
      <author>Maurice J. Bach</author>
      <title>Design of the UNIX Operating System</title>
      <publisher>Prentice Hall PTR</publisher>
      <isbn>0132017997</isbn>
      <year>1986</year>
    </book>
  </books>
</list>
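
For the follow-up case where the input is wrapped in opml with a head, the /* template would match opml and the text inside head ("tmp") would leak into the result through the built-in rules. A minimal sketch of the same push-style stylesheet with one extra empty template, assuming the opml/head/body layout shown in the question:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes" />
  <xsl:strip-space elements="*" />

  <!-- Root element (opml or body) becomes list -->
  <xsl:template match="/*">
    <list>
      <xsl:apply-templates />
    </list>
  </xsl:template>

  <!-- Added in this sketch: drop head and all of its content -->
  <xsl:template match="head" />

  <!-- An outline that contains other outlines is a group -->
  <xsl:template match="outline[outline]">
    <books text="{@text}">
      <xsl:apply-templates />
    </books>
  </xsl:template>

  <!-- An outline without child elements is a single book -->
  <xsl:template match="outline[not(*)]">
    <book>
      <xsl:apply-templates select="@*" />
    </book>
  </xsl:template>

  <!-- Each attribute of a book becomes an element of the same name -->
  <xsl:template match="outline[not(*)]/@*">
    <xsl:element name="{name()}">
      <xsl:value-of select="." />
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

When opml is the root, body has no template of its own, so the built-in rule passes processing through to its outline children. Also keep in mind that the relative order of attribute nodes selected by @* is implementation-dependent in XSLT 1.0, so the child elements of book may not come out in the same order with every processor; select the attributes one by one if the order matters.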
answered Jan 17 '23 00:01 by ABach