
How to use the browser's (Chrome/Firefox) HTML/CSS/JS rendering engine to produce a PDF?


There are nice projects that generate PDF from HTML/CSS/JS files:

  1. http://wkhtmltopdf.org/ (open source)
  2. https://code.google.com/p/flying-saucer/ (open source)
  3. http://cssbox.sourceforge.net/ (not necessarily straight pdf generation)
  4. http://phantomjs.org/ (open source; allows PDF output)
  5. http://www.princexml.com/ (commercial, but hands down the best one out there)
  6. https://thepdfapi.com/ a chrome modification to spit pdf from html from

I want to programmatically control the Chrome or Firefox browser (because both are cross-platform) so that it loads a web page, runs the scripts, styles the page, and generates a PDF file for printing.

But how do I start controlling the browser in an automated way, so that I can run something like:

render-to-pdf file-to-render.html out.pdf

I can easily do this manually by browsing to the page and printing it to PDF, and I get an accurate, 100% spec-compliant rendering of the HTML/CSS/JS page in a PDF file. Even the URL headers can be omitted from the PDF through the browser's configuration options. But again, where do I start in automating this process?

I want to automate, on the server side, opening the browser, navigating to a page, and generating the PDF from the browser-rendered page.

I have done a lot of research; I just don't know how to ask the right question. I want to programmatically control the browser, maybe like Selenium does, but to the point where I can export a web page as PDF (hence using the rendering capabilities of the browser to produce good PDFs).

David Hofmann asked Aug 29 '14 18:08


2 Answers

I'm not an expert, but PhantomJS seems to be the right tool for the job. Note that it is a headless browser built on QtWebKit, not on Chrome/Chromium, so its output may differ from what those browsers render.

    var page = require('webpage').create();

    page.open('http://github.com/', function (status) {
        // Bail out early if the page could not be loaded.
        if (status !== 'success') {
            console.log('Failed to load the page');
            phantom.exit(1);
            return;
        }

        // Measure the full document size from inside the page context.
        var s = page.evaluate(function () {
            var body = document.body,
                html = document.documentElement;
            var height = Math.max(body.scrollHeight, body.offsetHeight,
                html.clientHeight, html.scrollHeight, html.offsetHeight);
            var width = Math.max(body.scrollWidth, body.offsetWidth,
                html.clientWidth, html.scrollWidth, html.offsetWidth);
            return {width: width, height: height};
        });

        console.log(JSON.stringify(s));

        // Make the paper as tall as the document so it fits on a single page.
        page.paperSize = {
            width: '1980px',
            height: s.height + 'px',
            margin: {
                top: '50px',
                left: '20px'
            }
        };

        // The .pdf extension selects PDF output.
        page.render('github.pdf');
        phantom.exit();
    });
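To get closer to the `render-to-pdf file-to-render.html out.pdf` command from the question, the same PhantomJS calls can take the input and output from the command line. This is only a rough sketch: the script name `render-to-pdf.js`, the helpers `parseArgs` and `render`, and the A4 paper settings are my own assumptions, not something from the question.

```javascript
// render-to-pdf.js (hypothetical name)
// Run as: phantomjs render-to-pdf.js file-to-render.html out.pdf

// Pick input and output out of PhantomJS's system.args,
// where args[0] is the script name itself.
function parseArgs(args) {
    if (args.length !== 3) {
        return null;
    }
    return { input: args[1], output: args[2] };
}

// Load the page and write it out as a PDF.
function render(opts) {
    var page = require('webpage').create();
    // Assumption: A4 portrait with 1cm margins; adjust as needed.
    page.paperSize = { format: 'A4', orientation: 'portrait', margin: '1cm' };
    page.open(opts.input, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + opts.input);
            phantom.exit(1);
            return;
        }
        page.render(opts.output);
        phantom.exit();
    });
}

// Only start rendering when running inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var opts = parseArgs(require('system').args);
    if (opts) {
        render(opts);
    } else {
        console.log('Usage: phantomjs render-to-pdf.js <input> <output.pdf>');
        phantom.exit(1);
    }
}
```

Unlike the single-page example above, this lets PhantomJS paginate the document, which is usually what you want for printing.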

Hope it helps.

crodas answered Nov 01 '22 15:11


Firefox has an API method for that: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/tabs/saveAsPDF

    browser.tabs.saveAsPDF({})
      .then((status) => {
        console.log('PDF file status: ' + status);
      });

However, it seems to be available only to browser extensions; it cannot be invoked from a web page.

I'm still looking for a public API for that...

Guillermo Gutiérrez answered Nov 01 '22 16:11