We intend to use wkhtmltopdf to convert HTML to PDF, but we are concerned about its scalability. Does anyone have any idea how it scales? Our web app could potentially attempt to convert hundreds of thousands of (relatively complex) HTML documents, so it's important for us to have some idea. Has anyone got any information on this?
Wkhtmltopdf is a simple and effective open source command-line utility that lets the user convert any given HTML (web page) to a PDF document or an image (jpg, png, etc.).
Is wkhtmltopdf safe to use? A scan of the latest version of wkhtmltopdf flagged it as needing a security review, although a total of 0 vulnerabilities or license issues were detected.
Another (I would say the easiest) means of debugging JavaScript in wkhtmltopdf is to download QT Browser, the underlying browser used by wkhtmltopdf, and to inspect the JavaScript execution for your page from within the browser. You can basically debug your JavaScript in QT Browser just as you would in Chrome or Firefox.
wkhtmltopdf is an open source command-line tool that can save web pages as a PDF or an image (gHacks Tech News, by Ashwin, December 09, 2020; last updated December 15, 2020).
Wkhtmltopdf is not very fault tolerant. For example, if it cannot find an image (or another resource such as a .js or .css file), it doesn't create the PDF file. It simply fails and writes the error message to STDERR. The error messages of wkhtmltopdf can be quite long and a bit messy.
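Since a failed conversion reports itself only through the exit status and STDERR, a caller usually wants to capture both. Here is a minimal Python sketch of that idea, assuming the binary is on the PATH as `wkhtmltopdf`; the helper names (`run_conversion`, `summarize_stderr`, `ConversionResult`) and the 120 second timeout are made up for illustration and are not part of wkhtmltopdf.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ConversionResult:
    ok: bool            # True when the process exited with status 0
    returncode: int     # -1 when the process never produced an exit status
    error_summary: str  # trimmed STDERR, or a short description of the failure


def summarize_stderr(stderr: str, max_lines: int = 5) -> str:
    """Keep only the last few non-empty lines of a long, messy error output."""
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    return "\n".join(lines[-max_lines:])


def wkhtmltopdf_command(html_path: str, pdf_path: str,
                        binary: str = "wkhtmltopdf") -> List[str]:
    """Build the basic call: wkhtmltopdf [options] <input> <output>."""
    return [binary, "--quiet", html_path, pdf_path]


def run_conversion(command: Sequence[str], timeout: float = 120.0) -> ConversionResult:
    """Run one conversion and report failure as a value instead of raising."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return ConversionResult(False, -1, f"timed out after {timeout} seconds")
    except FileNotFoundError:
        return ConversionResult(False, -1, f"executable not found: {command[0]}")
    return ConversionResult(
        completed.returncode == 0,
        completed.returncode,
        summarize_stderr(completed.stderr),
    )
```

You would call it as `run_conversion(wkhtmltopdf_command("page.html", "page.pdf"))` and log `error_summary` whenever `ok` is false. The 0.12.x releases also document a `--load-error-handling` option (abort, ignore or skip) that may soften the behaviour for missing resources, but check what your version supports before relying on it.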
You can create your own pool of wkhtmltopdf engines. I did it for a simple use case by invoking the API directly instead of starting a wkhtmltopdf.exe process every time. The wkhtmltopdf API is not thread-safe, so it's not easy to do.
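This is not the pool described above, but one common way to cope with an API that is not thread-safe is to confine every call to a single dedicated thread and let other threads queue their requests. A rough Python sketch of that pattern follows; `render` is a hypothetical callable standing in for whatever wrapper you have around the wkhtmltopdf API, and `SingleThreadRenderer` is a name invented here. It assumes the API tolerates being driven from one long-lived thread.

```python
import queue
import threading
from concurrent.futures import Future
from typing import Callable


class SingleThreadRenderer:
    """Funnels all render calls through one worker thread."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render  # hypothetical wrapper around the non-thread-safe API
        self._jobs: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self) -> None:
        # Only this thread ever touches self._render.
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel pushed by close()
                break
            html, future = job
            try:
                future.set_result(self._render(html))
            except Exception as exc:  # hand the error back to the caller
                future.set_exception(exc)

    def submit(self, html: str) -> Future:
        """Safe to call from any thread; returns a Future for the PDF bytes."""
        future: Future = Future()
        self._jobs.put((html, future))
        return future

    def close(self) -> None:
        """Finish the queued jobs, then stop the worker thread."""
        self._jobs.put(None)
        self._thread.join()
```

Callers do `renderer.submit(html).result()` from any thread. This serializes the conversions, so it removes the per-process startup cost but does not add parallelism; for that you would still need several processes.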
First of all, your question is quite general; there are many variables to consider when asking about the scalability of any project. Obviously there is a difference between converting "hundreds of thousands" of HTML files over a week and expecting to do that in a day, or an hour. On top of that, "relatively complex" HTML can mean different things to different people.
That being said, since I have done something similar to this, converting approximately 450,000 HTML files with wkhtmltopdf, I figured I'd share my experience.
Here was my scenario:
I used a simple single-threaded script written in PHP to iterate over the folders and pass each HTML file path to wkhtmltopdf. The process took about 2.5 days to convert all the files, with very few errors.
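To put those figures in perspective, here is the back-of-the-envelope arithmetic as a small Python snippet. The inputs (roughly 450,000 files, about 2.5 days) come from the run above; the `ideal_days` estimate assumes perfectly linear scaling across workers, which is an optimistic assumption and not something that was measured.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def throughput(files: int, days: float) -> tuple:
    """Return (seconds per file, files per second) for a finished run."""
    seconds = days * SECONDS_PER_DAY
    return seconds / files, files / seconds


def ideal_days(files: int, seconds_per_file: float, workers: int) -> float:
    """Best-case duration in days if the work splits evenly across workers."""
    return files * seconds_per_file / workers / SECONDS_PER_DAY


sec_per_file, files_per_sec = throughput(450_000, 2.5)
print(f"{sec_per_file:.2f} s per file, {files_per_sec:.2f} files per second")
# prints "0.48 s per file, 2.08 files per second"
```

So the single-threaded run averaged roughly half a second per document. Under the linear-scaling assumption, four workers would bring the same job down to about 0.625 days, though disk and CPU contention will eat into that.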
I hope this gives you some insight into what you can expect from using wkhtmltopdf in your web application. Some obvious improvements would come from running this on better hardware, but mainly from using a multi-threaded application to process files simultaneously.
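For what the simultaneous version might look like, here is a minimal Python sketch (the original script was PHP and is not shown, so this is not a port of it). It assumes `wkhtmltopdf` is on the PATH; `convert_one`, `convert_all` and the default of 4 workers are illustrative choices. Threads are enough here because each one just waits on an external wkhtmltopdf process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List, Sequence, Tuple

Result = Tuple[Path, bool, str]  # (source file, succeeded?, STDERR text)


def wkhtmltopdf_command(html_path: Path, pdf_path: Path) -> List[str]:
    """Build the basic call: wkhtmltopdf [options] <input> <output>."""
    return ["wkhtmltopdf", "--quiet", str(html_path), str(pdf_path)]


def convert_one(html_path: Path, out_dir: Path,
                build_command: Callable[[Path, Path], List[str]]) -> Result:
    """Convert a single file in its own process and report the outcome."""
    # Output is named after the file stem, so same-named files from
    # different folders would overwrite each other in out_dir.
    pdf_path = out_dir / (html_path.stem + ".pdf")
    try:
        completed = subprocess.run(
            build_command(html_path, pdf_path), capture_output=True, text=True
        )
    except OSError as exc:  # e.g. the binary is not installed
        return html_path, False, str(exc)
    return html_path, completed.returncode == 0, completed.stderr.strip()


def convert_all(html_files: Sequence[Path], out_dir: Path, workers: int = 4,
                build_command: Callable[[Path, Path], List[str]] = wkhtmltopdf_command,
                ) -> List[Result]:
    """Run up to `workers` conversions at once; results keep the input order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(
            lambda path: convert_one(path, out_dir, build_command), html_files
        ))
```

A call such as `convert_all(sorted(Path("html").rglob("*.html")), Path("pdf"), workers=4)` would then process a folder tree. A sensible starting point for `workers` is the number of CPU cores, tuned from there by measuring.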