I need to transform a huge XML document into multiple HTML documents. The XML looks like this:
<society>
  <party_members>
    <member id="1" first_name="" last_name="O'Brien">
      <ministry_id>1</ministry_id>
      <ministry_id>3</ministry_id>
    </member>
    <member id="2" first_name="Julia" last_name="">
      <ministry_id>2</ministry_id>
    </member>
    <member id="3" first_name="Winston" last_name="Smith">
      <ministry_id>1</ministry_id>
    </member>
  </party_members>
  <ministries>
    <ministry>
      <id>1</id>
      <short_title>Minitrue</short_title>
      <long_title>Ministry of truth</long_title>
      <concerns>News, entertainment,education and arts </concerns>
    </ministry>
    <ministry>
      <id>2</id>
      <short_title>Minipax</short_title>
      <long_title>Ministry of Peace</long_title>
      <concerns>War</concerns>
    </ministry>
    <ministry>
      <id>3</id>
      <short_title>Minilove</short_title>
      <long_title>Ministry of Love</long_title>
      <concerns>Dissidents</concerns>
    </ministry>
  </ministries>
</society>
The number of party members can be very large (millions), while the number of ministries is small, around 300-400. For each party member there should be one output HTML document with the following content:
<html>
  <body>
    <h2>Party member: Winston Smith</h2>
    <h3>Works in:</h3>
    <div class="ministry">
      <h4>Ministry of truth</h4> - Minitrue
      <h5>Ministry of truth <i>concerns</i> itself with <i>News, entertainment,education and arts</i></h5>
    </div>
  </body>
</html>
The number of output documents should equal the number of party members.
I'm struggling with XSLT and can't get it to work.
Please help me decide whether XSLT is a good tool for this job. If it is, please give me a hint on how to implement it and which XSLT constructs to use.
Of course I could simply write a small transformation in a procedural language, but I'm looking for an 'apply a transformation template' approach rather than procedural parsing and modification, so that I can hand the template to other users for further changes (CSS, formatting, etc.).
I'm using Ruby + Nokogiri (which is a set of bindings to libxslt), but any language is acceptable.
If XSLT is a bad fit for this task, what other tools could be used, given that I must transform ~1M users in several minutes with low memory consumption?
Being able to parallelize the processing would be an additional benefit.
Thank you.
With pure XSLT 1.0 you can't create multiple result documents in a single transformation, which is what you seem to want. For that you need either an XSLT 2.0 processor such as Saxon 9 or AltovaXML, using the XSLT 2.0 instruction [xsl:result-document][1], or an XSLT 1.0 processor such as xsltproc/libxslt that implements the EXSLT extension element exsl:document (http://www.exslt.org/exsl/elements/document/index.html). If you can use one of them, XSLT is well suited to your task.
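For the XSLT 2.0 route, a minimal sketch might look like the following. It has not been run against your data. The output file name pattern (member{@id}.html) and the named output format member-html are assumptions made for illustration. Note that a non-streaming XSLT 2.0 processor still builds the whole input tree in memory.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Named serialization settings, referenced by each result document -->
  <xsl:output name="member-html" method="html" indent="yes"/>

  <!-- Index ministries by their id child element -->
  <xsl:key name="ministry-by-id" match="ministry" use="id"/>

  <xsl:template match="/">
    <xsl:for-each select="society/party_members/member">
      <!-- One output file per member; the file name pattern is an assumption -->
      <xsl:result-document href="member{@id}.html" format="member-html">
        <html>
          <body>
            <h2>Party member: <xsl:value-of select="normalize-space(concat(@first_name, ' ', @last_name))"/></h2>
            <h3>Works in:</h3>
            <xsl:apply-templates select="key('ministry-by-id', ministry_id)"/>
          </body>
        </html>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="ministry">
    <div class="ministry">
      <h4><xsl:value-of select="long_title"/></h4> - <xsl:value-of select="short_title"/>
      <h5><xsl:value-of select="long_title"/><xsl:text> </xsl:text><i>concerns</i> itself with <i><xsl:value-of select="concerns"/></i></h5>
    </div>
  </xsl:template>

</xsl:stylesheet>
```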
[edit] With libxslt (or its command-line tool xsltproc), the following stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl"
    extension-element-prefixes="exsl"
    version="1.0">
  <xsl:output method="html" indent="yes"/>
  <!-- Index ministries by id so each member's lookup avoids a full scan -->
  <xsl:key name="ministry-by-id" match="ministry" use="id"/>
  <xsl:template match="/">
    <xsl:apply-templates select="society/party_members/member" mode="doc"/>
  </xsl:template>
  <!-- Write one result document per member -->
  <xsl:template match="member" mode="doc">
    <exsl:document href="member{@id}.html">
      <html>
        <body>
          <!-- normalize-space drops the stray space when one of the names is empty -->
          <h2>Party member: <xsl:value-of select="normalize-space(concat(@first_name, ' ', @last_name))"/></h2>
          <h3>Works in:</h3>
          <xsl:apply-templates select="key('ministry-by-id', ministry_id)"/>
        </body>
      </html>
    </exsl:document>
  </xsl:template>
  <xsl:template match="ministry">
    <div class="ministry">
      <h4><xsl:value-of select="long_title"/></h4> - <xsl:value-of select="short_title"/>
      <!-- xsl:text keeps the space that whitespace stripping would otherwise remove -->
      <h5><xsl:value-of select="long_title"/><xsl:text> </xsl:text><i>concerns</i> itself with <i><xsl:value-of select="concerns"/></i></h5>
    </div>
  </xsl:template>
</xsl:stylesheet>
shows how to use exsl:document to output several result documents with one transformation. It also uses a key to improve performance. Let us know whether that code works for your huge input data.
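To illustrate what the key does: inside the member template, the key() lookup should select the same ministry elements as the predicate-based select sketched below. This variant needs no xsl:key declaration, but it may make the processor scan all ministries for every member. Whether it actually does depends on the processor's optimizations.

```xml
<!-- Same selection as key('ministry-by-id', ministry_id), but without the index -->
<xsl:apply-templates select="/society/ministries/ministry[id = current()/ministry_id]"/>
```

With only 300-400 ministries the difference per member is small, but it is multiplied across millions of members.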