In this thread I refer to my last thread: Convert XML to CSV using XSLT - dynamic columns.
The XSLT script in the referred thread works fine, but with a large XML document its performance is poor. Now I want to write an XSLT script that outputs another XSLT script, which in turn outputs the final CSV file.
Question:
How do I write the first XSLT script? Its output should look like the following:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/*">
<xsl:text>Name;</xsl:text>
<xsl:text>Brother;</xsl:text>
<xsl:text>Sister</xsl:text>
<!-- this part is dynamic -->
<xsl:apply-templates select="Person" />
</xsl:template>
<xsl:template match="Person">
<xsl:value-of select="Name" />
<xsl:value-of select="Brother" />
<xsl:value-of select="Sister" />
<!-- this part is dynamic too -->
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
The input XML file is the same as in the referred thread:
<Person>
<Name>John</Name>
<FamilyMembers>
<FamilyMember>
<Name>Lisa</Name>
<Type>Sister</Type>
</FamilyMember>
<FamilyMember>
<Name>Tom</Name>
<Type>Brother</Type>
</FamilyMember>
</FamilyMembers>
</Person>
<Person>
<Name>Daniel</Name>
<FamilyMembers>
<FamilyMember>
<Name>Peter</Name>
<Type>Father</Type>
</FamilyMember>
</FamilyMembers>
</Person>
For every distinct Type value in the input there should be one line like the following in the resulting XSLT script:
<xsl:text>Type;</xsl:text>
The standard way to transform XML data into other formats is Extensible Stylesheet Language Transformations (XSLT), which uses stylesheets to describe the conversion. You can use the built-in XSLTRANSFORM function to convert XML documents into HTML, plain text, or different XML schemas.
XSLT is designed to be used as part of XSL. In addition to XSLT, XSL includes an XML vocabulary for specifying formatting. XSL specifies the styling of an XML document by using XSLT to describe how the document is transformed into another XML document that uses the formatting vocabulary.
An accumulator defines some processing that is to take place while a document is being sequentially processed: for example, a total that is to be accumulated. Available in XSLT 3.0. From Saxon 9.8, available in all editions.
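As a rough illustration of the accumulator idea, the following XSLT 3.0 sketch keeps a running count of Person elements while the document is traversed. The accumulator name person-count is made up for this example, and the stylesheet assumes an XSLT 3.0 processor and an input shaped like the Person documents in this thread.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Make the (hypothetically named) accumulator applicable to the input document -->
  <xsl:mode use-accumulators="person-count"/>

  <!-- Running total: starts at 0 and grows by 1 at every Person element -->
  <xsl:accumulator name="person-count" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="Person" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template match="/">
    <!-- At the document node, accumulator-after() reflects all descendants -->
    <xsl:value-of select="accumulator-after('person-count')"/>
  </xsl:template>
</xsl:stylesheet>
```

This is not needed for the CSV problem itself; it only shows what the accumulator declaration and its rule look like.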
To write one XSLT that outputs another you either need to generate the output elements using <xsl:element>, e.g. <xsl:element name="xsl:text">, or use <xsl:namespace-alias> if you want to use literal result elements. The XSLT spec has an example:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias">
<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>
<xsl:template match="/">
<axsl:stylesheet>
<xsl:apply-templates/>
</axsl:stylesheet>
</xsl:template>
</xsl:stylesheet>
Any <axsl:...> elements in the stylesheet will become <xsl:...> in the output.
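Applied to the question, a generator stylesheet along these lines might look as follows. This is a minimal XSLT 1.0 sketch, not a tested production solution. It assumes the Person elements are wrapped in a single root element, that Type values contain no apostrophes (they are pasted into a quoted XPath string), and it reuses the axsl prefix from the spec example. The key name kTypes and the variable name distinct-types are illustrative choices.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias">

  <!-- axsl:* literal result elements end up in the XSLT namespace in the output -->
  <xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>
  <xsl:output method="xml"/>

  <!-- Muenchian grouping: keep the first Type element for each distinct value -->
  <xsl:key name="kTypes" match="Type" use="."/>
  <xsl:variable name="distinct-types"
      select="/*/Person/FamilyMembers/FamilyMember/Type[
              generate-id() = generate-id(key('kTypes', .)[1])]"/>

  <xsl:template match="/">
    <axsl:stylesheet version="1.0">
      <axsl:output method="text"/>

      <!-- Generated template that writes the header row -->
      <axsl:template match="/*">
        <axsl:text>Name</axsl:text>
        <xsl:for-each select="$distinct-types">
          <axsl:text>;<xsl:value-of select="."/></axsl:text>
        </xsl:for-each>
        <!-- The inner xsl:text keeps the newline from being stripped as
             whitespace when this generator stylesheet is loaded -->
        <axsl:text><xsl:text>&#10;</xsl:text></axsl:text>
        <axsl:apply-templates select="Person"/>
      </axsl:template>

      <!-- Generated template that writes one row per Person -->
      <axsl:template match="Person">
        <axsl:value-of select="Name"/>
        <xsl:for-each select="$distinct-types">
          <axsl:text>;</axsl:text>
          <!-- {.} is an attribute value template: the current Type value
               is written into the generated select expression -->
          <axsl:value-of select="FamilyMembers/FamilyMember[Type='{.}']/Name"/>
        </xsl:for-each>
        <axsl:text><xsl:text>&#10;</xsl:text></axsl:text>
      </axsl:template>
    </axsl:stylesheet>
  </xsl:template>
</xsl:stylesheet>
```

Running this against the input yields a second stylesheet with one hard-coded column per distinct Type value. Running that second stylesheet against the same input produces the CSV. Because the columns are fixed in the generated stylesheet, it has to be regenerated whenever new Type values appear in the data.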
Rather than a two-external-phase solution (meaning a style-sheet that writes a style-sheet that gets executed), I think you would be better served by a version of Tim's solution that performs better at scale. Please measure the performance of this solution with your 'large XML document' as input.
This XSLT 1.0 style-sheet...
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:key name="kTypes" match="Type" use="." />
<xsl:variable name="distinct-types"
select="/*/Person/FamilyMembers/FamilyMember/Type[
generate-id()=generate-id(key('kTypes',.)[1])]" />
<xsl:template match="/">
<xsl:value-of select="'Name;'" />
<xsl:for-each select="$distinct-types">
<xsl:value-of select="." />
<xsl:if test="position() &lt; last()">
<xsl:value-of select="';'" />
</xsl:if>
</xsl:for-each>
<xsl:value-of select="'&#x0A;'" />
<xsl:apply-templates select="*/Person" />
</xsl:template>
<xsl:template match="Person">
<xsl:value-of select="concat(Name,';')" />
<xsl:variable name="family" select="FamilyMembers/FamilyMember" />
<xsl:for-each select="$distinct-types">
<xsl:variable name="type" select="string(.)" />
<xsl:value-of select="$family/self::*[Type=$type]/Name" />
<xsl:if test="position() &lt; last()">
<xsl:value-of select="';'" />
</xsl:if>
</xsl:for-each>
<xsl:value-of select="'&#x0A;'" />
</xsl:template>
</xsl:stylesheet>
...will transform this input (or others efficiently at scale) ...
<t>
<Person>
<Name>John</Name>
<FamilyMembers>
<FamilyMember>
<Name>Lisa</Name>
<Type>Sister</Type>
</FamilyMember>
<FamilyMember>
<Name>Tom</Name>
<Type>Brother</Type>
</FamilyMember>
</FamilyMembers>
</Person>
<Person>
<Name>Daniel</Name>
<FamilyMembers>
<FamilyMember>
<Name>Peter</Name>
<Type>Father</Type>
</FamilyMember>
</FamilyMembers>
</Person>
</t>
... and yield text...
Name;Sister;Brother;Father
John;Lisa;Tom;
Daniel;;;Peter