Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any way improve the performance of FlyingSaucer?

I've followed this article to use FlyingSaucer to convert XHTML to PDF and it's brilliant but has one major downfall... it's ridiculously slow!

I'm finding that it takes between 1 and 2 minutes to render a PDF from an XHTML, regardless of how simple that page is.

Basic code:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.xhtmlrenderer.pdf.ITextRenderer;
import com.lowagie.text.DocumentException;

public class FirstDoc {

    public static void main(String[] args) throws IOException, DocumentException {

        String inputFile = "firstdoc.xhtml";
        String url = new File(inputFile).toURI().toURL().toString();
        String outputFile = "firstdoc.pdf";
        OutputStream os = new FileOutputStream(outputFile);

        ITextRenderer renderer = new ITextRenderer();
        renderer.setDocument(url);
        renderer.layout();
        renderer.createPDF(os);

        os.close();
    }
}

Sample XHTML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>My First Document</title>
        <style type="text/css"> b { color: green; } </style>
    </head>
    <body>
        <p>
            <b>Greetings Earthlings!</b>
            We've come for your Java.
        </p>
    </body>
</html>

Does anyone know how to improve the performance of FlyingSaucer?

Failing that, is anyone able to recommend an alternative Java library which is effective at rendering a PDF from a URL to an (X)HTML document with external CSS and images generated from URLs?

like image 373
Edd Avatar asked Nov 29 '22 18:11

Edd


1 Answers

I was facing the same problem as Edd.

Sadly the next approach didn't work Java DocumentBuilder: xml parsing is very slow? by Marek Piechut completely for me - my HTML entities got lost on the way.

DocumentBuilderFactory fac = DocumentBuilderFactory.newInstance();
fac.setNamespaceAware(false);
fac.setValidating(false);
fac.setFeature("http://xml.org/sax/features/namespaces", false);
fac.setFeature("http://xml.org/sax/features/validation", false);
fac.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
fac.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder builder = fac.newDocumentBuilder();

What finally did the trick were these lines:

DocumentBuilderFactory fac = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = fac.newDocumentBuilder();
builder.setEntityResolver(FSEntityResolver.instance());

By using the built-in Java EntityResolver for resolving the DTD it got faster tremendously.

like image 56
Mister Henson Avatar answered Dec 04 '22 06:12

Mister Henson