
XML to CSV using XSLT help

Tags: csv, xslt

I'd like to convert XML to CSV using XSLT, but when I apply the XSL from the SO thread titled "XML To CSV XSLT" to my input:

<WhoisRecord>
  <DomainName>127.0.0.1</DomainName>
  <RegistryData>
    <AbuseContact>
      <Email>[email protected]</Email>
      <Name>Internet Corporation for Assigned Names and Number</Name>
      <Phone>+1-310-301-5820</Phone>
    </AbuseContact>
    <AdministrativeContact i:nil="true"/>
    <BillingContact i:nil="true"/>
    <CreatedDate/>
    <RawText>...</RawText>
    <Registrant>
      <Address>4676 Admiralty Way, Suite 330</Address>
      <City>Marina del Rey</City>
      <Country>US</Country>
      <Name>Internet Assigned Numbers Authority</Name>
      <PostalCode>90292-6695</PostalCode>
      <StateProv>CA</StateProv>
    </Registrant>
    <TechnicalContact>
      <Email>[email protected]</Email>
      <Name>Internet Corporation for Assigned Names and Number</Name>
      <Phone>+1-310-301-5820</Phone>
    </TechnicalContact>
    <UpdatedDate>2010-04-14</UpdatedDate>
    <ZoneContact i:nil="true"/>
  </RegistryData>
</WhoisRecord>

I end up with:

  [email protected] Corporation for Assigned Names and Number+1-310-301-5820,
    ,
    ,
    ,
    ...,      
    4676 Admiralty Way, Suite 330Marina del ReyUSInternet Assigned Numbers Authority90292-6695CA,      
    [email protected] Corporation for Assigned Names and Number+1-310-301-5820,      
    2010-04-14,

My problem is that the resulting output is missing nodes (such as the DomainName element containing the IP address), and some child nodes are concatenated without commas (such as the children of AbuseContact).

I'd like all of the XML content to appear in the CSV output, with the values inside strings like "[email protected] Corporation for Assigned Names and Number+1-310-301-5820," separated by commas.

My XSL is pretty rusty. Your help is appreciated. :)

Here's the XSL I'm using:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="iso-8859-1"/>

<xsl:strip-space elements="*" />

<xsl:template match="/*/child::*">
  <xsl:for-each select="child::*">
    <xsl:if test="position() != last()"><xsl:value-of select="normalize-space(.)"/>,    </xsl:if>
    <xsl:if test="position()  = last()"><xsl:value-of select="normalize-space(.)"/><xsl:text>
</xsl:text>
  </xsl:if>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

Adam Kahtava asked May 17 '10 16:05



1 Answer

This simple transformation produces the wanted result:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:strip-space elements="*"/>

    <xsl:template match="/">
    <xsl:apply-templates select="//text()"/>
    </xsl:template>

    <xsl:template match="text()">
      <xsl:copy-of select="."/>
      <xsl:if test="not(position()=last())">,</xsl:if>
    </xsl:template>
</xsl:stylesheet>

Do note the use of:

 <xsl:strip-space elements="*"/>

to discard any white-space-only text nodes.
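
If stripping white space globally is not an option, a similar effect can probably be had by filtering the text nodes in the select expression instead. The following is only a sketch of that alternative, not part of the solution above: the predicate `[normalize-space()]` keeps just those text nodes that contain something other than white space, and `position()`/`last()` are then evaluated against the filtered node-set.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

    <xsl:template match="/">
      <!-- Select only text nodes with non-white-space content -->
      <xsl:apply-templates select="//text()[normalize-space()]"/>
    </xsl:template>

    <xsl:template match="text()">
      <!-- Trim and collapse white space inside each value -->
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:if test="not(position()=last())">,</xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

Unlike `xsl:strip-space`, this leaves the source tree untouched, so it should behave the same way for this input while being easier to combine with templates that do care about white space.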

Update: AJ raised the problem that the results should be grouped into records/tuples, one per line. The question doesn't define exactly what a record/tuple should be. Therefore the current solution solves the two problems of white-space-only text nodes and of missing commas, but does not attempt to group the output into records/tuples.
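
For illustration only, here is one way the grouping could look under an assumed definition of a record. The assumption (mine, not the question's) is that every element that directly contains at least one value-bearing child element forms one record, and that each such child is one field. Fields are wrapped in double quotes so that a value with an embedded comma, such as the Address, stays in one column. The sketch also assumes that no value contains a double quote itself, since XSLT 1.0 would need a recursive template to escape those.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:strip-space elements="*"/>

    <xsl:template match="/">
      <!-- Assumed record: any element with a child that has a text node -->
      <xsl:for-each select="//*[*[text()]]">
        <!-- Assumed field: each child element that has a text node -->
        <xsl:for-each select="*[text()]">
          <xsl:text>"</xsl:text>
          <xsl:value-of select="normalize-space(.)"/>
          <xsl:text>"</xsl:text>
          <xsl:if test="position() != last()">,</xsl:if>
        </xsl:for-each>
        <!-- End the record with a line feed -->
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

With the input from the question this should give one line for the DomainName value, one for the direct value-bearing children of RegistryData, and one each for AbuseContact, Registrant and TechnicalContact. Empty elements such as CreatedDate and the i:nil contacts have no text node and are skipped, so columns do not line up across records.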

Dimitre Novatchev answered Sep 19 '22 08:09