I wish to convert an HTML file into a PDF file, using R.
Is there a command, or a combination of tools/commands, that can perform this conversion?
Update: if you have Pandoc installed (plus a LaTeX distribution, which Pandoc uses to build the PDF), you can use something like:
html_to_pdf <- function(html_file, pdf_file) {
  # shQuote() protects paths that contain spaces or shell metacharacters
  cmd <- sprintf("pandoc %s -t latex -o %s", shQuote(html_file), shQuote(pdf_file))
  system(cmd)
}
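A slightly more defensive variant of the same idea, as a sketch only: it checks that pandoc is on the PATH with Sys.which(), verifies that the input file exists, and uses system2(), which takes the arguments as a character vector instead of one pasted command string. The function name html_to_pdf_safe and its error messages are made up for this example; they are not part of Pandoc.

```r
# Sketch: convert HTML to PDF with Pandoc, failing loudly if something is missing.
# html_to_pdf_safe is a hypothetical name chosen for this example.
html_to_pdf_safe <- function(html_file, pdf_file) {
  if (!nzchar(Sys.which("pandoc"))) {
    stop("pandoc was not found on the PATH")
  }
  if (!file.exists(html_file)) {
    stop("input file does not exist: ", html_file)
  }
  # Same pandoc arguments as above; shQuote() guards paths with spaces
  status <- system2(
    "pandoc",
    c(shQuote(html_file), "-t", "latex", "-o", shQuote(pdf_file))
  )
  if (status != 0) {
    stop("pandoc exited with status ", status)
  }
  invisible(pdf_file)
}
```

Called as html_to_pdf_safe("in.html", "out.pdf"), it stops with an error instead of silently producing nothing when the conversion fails.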
There are a few web services that do HTML-to-PDF conversion and have REST APIs, so you can call them with RCurl. A quick internet search gives pdfcrowd.com. They let you upload documents as well as convert URLs, but it's a paid service.
Next hit is joliprint, which is free. Try this:
library(RCurl)
url_to_convert <- curlEscape("http://lifehacker.com/5706937/dont-make-important-decisions-until-your-decision-time") # or wherever
the_pdf <- getForm(
  "http://eu.joliprint.com/api/rest/url/print",
  url = url_to_convert
)
wkhtmltopdf is a nice cross-platform tool for this. Install it as appropriate for your operating system, then call it from R, e.g.:
system("wkhtmltopdf --javascript-delay 1 in.html out.pdf")
I found the --javascript-delay option necessary to avoid the message "Loading [Contrib]/a11y/accessibility-menu.js" being included in the PDF. The message comes from loading MathJax, which HTML files generated by R Markdown do.
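If you call this often, the command can be wrapped in a small function so that the delay becomes a parameter. This is a rough sketch, not a tested recipe: the names wkhtmltopdf_args and html_to_pdf_wk are invented here, and the default of 1 simply mirrors the command above. wkhtmltopdf interprets the --javascript-delay value as milliseconds.

```r
# Build the argument vector for wkhtmltopdf (hypothetical helper name).
wkhtmltopdf_args <- function(html_file, pdf_file, js_delay_ms = 1) {
  c(
    "--javascript-delay", sprintf("%d", as.integer(js_delay_ms)),
    shQuote(html_file), shQuote(pdf_file)
  )
}

# Run the conversion; stops with an error if the tool is missing or fails.
html_to_pdf_wk <- function(html_file, pdf_file, js_delay_ms = 1) {
  if (!nzchar(Sys.which("wkhtmltopdf"))) {
    stop("wkhtmltopdf was not found on the PATH")
  }
  status <- system2(
    "wkhtmltopdf",
    wkhtmltopdf_args(html_file, pdf_file, js_delay_ms)
  )
  if (status != 0) {
    stop("wkhtmltopdf exited with status ", status)
  }
  invisible(pdf_file)
}
```

For example, html_to_pdf_wk("in.html", "out.pdf") runs the same conversion as the one-liner, and html_to_pdf_wk("in.html", "out.pdf", js_delay_ms = 500) waits longer for scripts to finish.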