 

How to convert an HTML web page into a PDF file using Java

Tags:

java

pdf

I've been searching the internet for how to convert an HTML page into a PDF file using Java. I found a lot of pointers, but in short, they either don't work or are too difficult to implement. I also downloaded a commercial product, PD4ML; the API is something I'd be happy to work with, except that when I crawled a simple page on Wikipedia, I got an out-of-memory error (with Xmx set to 1024 MB). Some approaches suggest converting HTML -> XHTML -> FO -> PDF. However, I am getting a lot of exceptions from the XHTML-to-FO XSL file, and judging from the documentation, it's not something I have enough time to understand right now.

Here are my questions/concerns:
1. Is there another cohesive API out there (commercial or not) that will easily convert HTML to PDF?
2. Is there a way I can simply capture an HTML page and store it as a single file? This approach would be similar to the way Internet Explorer saves a web page as a web archive (single file, MHT format).

Any help is appreciated. (By the way, I know this question has been asked repeatedly, but in addition to the original spirit of the question, I'm open to other ways.) Thanks.

jake asked Oct 24 '25 20:10



1 Answer

Try wkhtmltopdf, which uses WebKit to render the page. Another option (the one I'm currently using) is OpenOffice, remote-controlled via macros.
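Since wkhtmltopdf is a command-line tool rather than a Java library, one way to use it from Java is to launch it as an external process. The following is only a minimal sketch under some assumptions: the `wkhtmltopdf` binary is installed and reachable by the name you pass in, and the class and method names (`HtmlToPdf`, `buildCommand`, `convert`) are made up for illustration and are not part of wkhtmltopdf itself.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of wkhtmltopdf: it only shells out to the CLI.
class HtmlToPdf {

    // Builds the argument list in the form: wkhtmltopdf [options] <input> <output>.
    // "source" may be a URL or a path to a local HTML file.
    static List<String> buildCommand(String executable, String source, Path output) {
        return Arrays.asList(executable, "--quiet", source, output.toString());
    }

    // Runs the tool and waits for it to finish; throws if it exits with a non-zero status.
    static void convert(String executable, String source, Path output)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(executable, source, output))
                .inheritIO() // forward the tool's stdout/stderr to this JVM's console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("wkhtmltopdf exited with status " + exitCode);
        }
    }
}
```

A call might look like `HtmlToPdf.convert("wkhtmltopdf", "https://en.wikipedia.org/wiki/PDF", Paths.get("page.pdf"))`. Because rendering happens in a separate process, the page's memory use does not count against the JVM heap, which may help with the out-of-memory problem described in the question.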

Thomas Mueller answered Oct 26 '25 11:10
