Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient XSLT pipeline in Java (or redirecting Results to Sources)

Tags:

I have a series of XSL 2.0 stylesheets that feed into each other, i.e. the output of stylesheet A feeds B feeds C.

What is the most efficient way of doing this? The question rephrased is: how can one efficiently route the output of one transformation into another.

Here's my first attempt:

@Override
public void transform(Source data, Result out) throws TransformerException{
    for(Transformer autobot : autobots){
        if(autobots.indexOf(autobot) != (autobots.size()-1)){
            log.debug("Transforming prelim stylesheet...");
            data = transform(autobot,data);
        }else{
            log.debug("Transforming final stylesheet...");
            autobot.transform(data, out);
        }
    }
}

private Source transform(Transformer autobot, Source data) throws TransformerException{
    DOMResult result = new DOMResult();
    autobot.transform(data, result);
    Node node = result.getNode();
    return new DOMSource(node);
}

As you can see, I'm using a DOM to sit in between transformations, and although it is convenient, it's non-optimal performance wise.

Is there any easy way to route to say, route a SAXResult to a SAXSource? A StAX solution would be another option.

I'm aware of projects like XProc, which is very cool if you haven't taken a look at yet, but I didn't want to invest in a whole framework.

like image 978
Chris Scott Avatar asked Aug 21 '09 14:08

Chris Scott


People also ask

What are the two 2 inputs of an XSLT processor?

The XSLT processor operates on two inputs: the XML document to transform, and the XSLT stylesheet that is used to apply transformations on the XML. Each of these two can actually be multiple inputs.

What is the output of an XSLT processor?

An XSLTProcessor applies an XSLT stylesheet transformation to an XML document to produce a new XML document as output.

What is XSLT in Java?

Using the XSLT Processor for Java: Overview. The Oracle XDK XSLT processor is a software program that transforms an XML document into another text-based format. For example, the processor can transform XML into XML, HTML, XHTML, or text.


2 Answers

I found this: #3. Chaining Transformations that shows two ways to use the TransformerFactory to chain transformations, having the results of one transform feed the next transform and then finally output to system out. This avoids the need for an intermediate serialization to String, file, etc. between transforms.

When multiple, successive transformations are required to the same XML document, be sure to avoid unnecessary parsing operations. I frequently run into code that transforms a String to another String, then transforms that String to yet another String. Not only is this slow, but it can consume a significant amount of memory as well, especially if the intermediate Strings aren't allowed to be garbage collected.

Most transformations are based on a series of SAX events. A SAX parser will typically parse an InputStream or another InputSource into SAX events, which can then be fed to a Transformer. Rather than having the Transformer output to a File, String, or another such Result, a SAXResult can be used instead. A SAXResult accepts a ContentHandler, which can pass these SAX events directly to another Transformer, etc.

Here is one approach, and the one I usually prefer as it provides more flexibility for various input and output sources. It also makes it fairly easy to create a transformation chain dynamically and with a variable number of transformations.

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();

// These templates objects could be reused and obtained from elsewhere.
Templates templates1 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet1.xslt")));
Templates templates2 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet1.xslt")));

TransformerHandler th1 = stf.newTransformerHandler(templates1);
TransformerHandler th2 = stf.newTransformerHandler(templates2);

th1.setResult(new SAXResult(th2));
th2.setResult(new StreamResult(System.out));

Transformer t = stf.newTransformer();
t.transform(new StreamSource(System.in), new SAXResult(th1));

// th1 feeds th2, which in turn feeds System.out.
like image 60
Mads Hansen Avatar answered Sep 27 '22 16:09

Mads Hansen


Related question Efficient XSLT pipeline, with params, in Java clarified on correct parameters passing to such transformer chain.

And it also gave a hint on slightly shorter solution without third transformer:

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();

Templates templates1 = stf.newTemplates(new StreamSource(
        getClass().getResourceAsStream("MyStylesheet1.xslt")));
Templates templates2 = stf.newTemplates(new StreamSource(
        getClass().getResourceAsStream("MyStylesheet2.xslt")));

TransformerHandler th1 = stf.newTransformerHandler(templates1);
TransformerHandler th2 = stf.newTransformerHandler(templates2);

th2.setResult(new StreamResult(System.out));

// Note that indent, etc should be applied to the last transformer in chain:
th2.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");

th1.getTransformer().transform(new StreamSource(System.in), new SAXResult(th2));
like image 21
Vadzim Avatar answered Sep 27 '22 18:09

Vadzim