
How to create a PDF file from HTML using PDFBox?

Tags: java, pdf, pdfbox

I am trying to create a PDF from HTML content.

public byte[] generatePdf(final XhtmlPDFGenerationRequest request) {

    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    PDDocument document = new PDDocument();
    InputStream stream = new ByteArrayInputStream(request.getContent()
            .getBytes());

    PDStream pdstream = new PDStream(document, stream);
    document.save(baos);
    document.close();
    return this.toByteArray(baos);

}

When I take this byte[] and save it to a file, the file is blank. I am using PDStream to embed the input stream into the document.

From the API docs at http://pdfbox.apache.org/apidocs/:

public PDStream(PDDocument doc,
                InputStream str)
         throws IOException

Reads all data from the input stream and embeds it into the document, this will close the InputStream.
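
When a saved file looks blank, one way to rule out the file-writing step is to check the bytes themselves. The following is a minimal sketch using only the Java standard library; the class and method names (`PdfBytesCheck`, `looksLikePdf`, `survivesRoundTrip`) are hypothetical and not part of PDFBox. It only tests for the `%PDF-` marker and a lossless write, so it says nothing about whether the document actually contains pages or visible content.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class PdfBytesCheck {

    // PDF files begin with the marker "%PDF-" followed by a version number.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the data starts with the PDF header marker.
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Writes the data to the target file and confirms the file holds the same bytes.
    static boolean survivesRoundTrip(byte[] data, Path target) throws IOException {
        Files.write(target, data);
        return Arrays.equals(data, Files.readAllBytes(target));
    }
}
```

If both checks pass and the file still opens blank, the problem is more likely in how the document content is built than in how the bytes are saved.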

vsingh asked Oct 31 '13 17:10



2 Answers

I was looking for an HTML-to-PDF renderer. We were using iText, and I wanted to do the same with Apache PDFBox, but it looks like PDFBox cannot do this on its own.

I can either use Apache FOP or continue using iText.

Here is the iText solution, if anyone is interested: Java Render XML Document as PDF

If you are looking for a way to merge PDF files using PDFBox, see: Merge pdf files using Apache pdf box

vsingh answered Oct 11 '22 01:10


The Open HTML to PDF library uses PDFBox under the hood and hides all the conversion complexity.

Usage is quite simple:

try (OutputStream os = new FileOutputStream("/Users/me/output.pdf")) {
    PdfRendererBuilder builder = new PdfRendererBuilder();
    builder.withUri("file:////Users/me/input.html");
    builder.toStream(os);
    builder.run();
}
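
The URI passed to `withUri` above is hard-coded. As a rough sketch, and assuming the input is a file on the local disk, the URI string could instead be derived from a path with the standard library. The class and method names here (`FileUriExample`, `toFileUri`) are hypothetical and not part of Open HTML to PDF.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class FileUriExample {

    // Converts a local file path into a "file:" URI string.
    // The path is made absolute first so the result does not depend
    // on how the caller spelled a relative path.
    static String toFileUri(String localPath) {
        Path path = Paths.get(localPath).toAbsolutePath();
        return path.toUri().toString();
    }
}
```

The returned string could then be handed to the builder, for example `builder.withUri(FileUriExample.toFileUri("/Users/me/input.html"))`. The exact form of the URI depends on the platform, since Windows paths include a drive letter.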
Andrey answered Oct 11 '22 02:10