
How to make wkhtmltopdf work in as many cases as possible?

This question about wkhtmltopdf has a general part and a specific part.

Generally: I am trying to convert a wide range of webpages into PDF files, and I want wkhtmltopdf to work in as many cases as possible. It's a pretty good tool, but I often run into pages it can't convert. Do you guys have a go-to set of flags that you use with wkhtmltopdf?

Specifically: one webpage that isn't anything far-out, but that I am having problems with, is http://gizmodo.com/microsoft-surface-book-review-so-good-i-might-switch-1737680767. When I run wkhtmltopdf without any flags (on Windows), I get the following:

>>wkhtmltopdf http://gizmodo.com/microsoft-surface-book-review-so-good-i-might-switch-1737680767 blah.pdf
Loading pages (1/6)
Error: Failed loading page http://gizmodo.com/microsoft-surface-book-review-so-good-i-might-switch-1737680767 (sometimes it will work just to ignore this error with --load-error-handling ignore)
Warning: A finished ResourceObject received a loading progress signal. This might be an indication of an iframe taking too long to load.
Warning: Received createRequest signal on a disposed ResourceObject's NetworkAccessManager. This might be an indication of an iframe taking too long to load.
Exit with code 1, due to unknown error.

If I follow the suggestion in the error message and use the --load-error-handling ignore flag, the PDF file is generated, but it's empty. How do I get wkhtmltopdf to work with this webpage?

I also looked at other tools, such as PhantomJS with rasterize.js, but that has its own set of problems...

Thanks, guys!

adrianX asked Oct 20 '25 16:10


2 Answers

This happens when JavaScript is enabled and the page's scripts take too long to finish. If the page needs JavaScript to render, add:

--javascript-delay 100000  

which adjusts how long wkhtmltopdf waits for JavaScript to finish (the value is in milliseconds), so the example above waits 100 seconds. Note that if you convert multiple documents in one run, this setting applies to the whole run, not to each individual document. So if you convert, say, 100 input HTML files into a single PDF, you may need a longer delay.

I also add to my scripts:

--no-stop-slow-scripts

which tells wkhtmltopdf not to stop slow-running JavaScript.
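
For reference, the two flags above can be combined in one invocation. What follows is only a minimal sketch, assuming a POSIX shell; the function name pdf_with_js_wait and the WKHTMLTOPDF override variable are made up for illustration and are not part of wkhtmltopdf itself.

```shell
# Hypothetical helper: convert a page while giving its JavaScript time to finish.
# Usage: pdf_with_js_wait URL OUTPUT.pdf [DELAY_MS]
pdf_with_js_wait() {
    url=$1
    out=$2
    delay_ms=${3:-100000}   # 100000 ms = 100 s, the value used above

    # WKHTMLTOPDF is an assumed override for the binary; defaults to the one on PATH.
    bin=${WKHTMLTOPDF:-wkhtmltopdf}

    # Wait for scripts to finish and do not stop slow-running ones.
    "$bin" --javascript-delay "$delay_ms" --no-stop-slow-scripts "$url" "$out"
}
```

With the function loaded, the page from the question would be converted with: pdf_with_js_wait http://gizmodo.com/microsoft-surface-book-review-so-good-i-might-switch-1737680767 blah.pdf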

stason answered Oct 23 '25 06:10


Turns out it's actually quite simple! Just use the "-n" flag (short for --disable-javascript), so the page's scripts never run. Works like a charm!
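
Since -n turns JavaScript off entirely, one possible way to cover more pages is to try the script-free conversion first and fall back to the long JavaScript delay from the other answer only if that fails. This is just a sketch of that idea, assuming a POSIX shell; the function name pdf_try_both and the WKHTMLTOPDF override variable are hypothetical, and a page that converts without scripts may still be missing content that JavaScript would have loaded.

```shell
# Hypothetical helper: try without JavaScript first, then with a long delay.
# Usage: pdf_try_both URL OUTPUT.pdf
pdf_try_both() {
    url=$1
    out=$2

    # WKHTMLTOPDF is an assumed override for the binary; defaults to the one on PATH.
    bin=${WKHTMLTOPDF:-wkhtmltopdf}

    # Attempt 1: -n (--disable-javascript) skips the page's scripts entirely.
    if "$bin" -n "$url" "$out"; then
        return 0
    fi

    # Attempt 2: keep JavaScript on, wait 100 s, and do not stop slow scripts.
    "$bin" --javascript-delay 100000 --no-stop-slow-scripts "$url" "$out"
}
```

The function returns the exit status of whichever attempt ran last, so a calling script can still detect a page that fails both ways.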

adrianX answered Oct 23 '25 05:10


