I've noticed a few inconsistencies when trying to use the headerTemplate and footerTemplate options with page.pdf.
I suspect that this happens because headers and footers are treated as separate documents and converted to image/pdf separately (https://cs.chromium.org/chromium/src/components/printing/resources/print_header_footer_template_page.html also implies something like that). Can someone familiar with the implementation explain how it actually works? Thanks!
Puppeteer controls Chrome or Chromium over the DevTools Protocol.
Chromium uses Skia for PDF generation.
Skia handles the header, set of objects, and footer separately.
From the Puppeteer Documentation:
page.pdf(options)
- options <Object> Options object which might have the following properties:
  - headerTemplate <string> HTML template for the print header. Should be valid HTML markup with following classes used to inject printing values into them:
    - date: formatted print date
    - title: document title
    - url: document location
    - pageNumber: current page number
    - totalPages: total pages in the document
  - footerTemplate <string> HTML template for the print footer. Should use the same format as the headerTemplate.
- returns: <Promise<Buffer>> Promise which resolves with PDF buffer.
NOTE Generating a pdf is currently only supported in Chrome headless.
NOTE headerTemplate and footerTemplate markup have the following limitations:
- Script tags inside templates are not evaluated.
- Page styles are not visible inside templates.
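Given those limitations, a template has to carry its own inline styles and rely only on the injected classes. The following is a minimal sketch of an options object for page.pdf(); buildTemplate is a hypothetical helper, and the font size, padding, and margin values are assumptions chosen for illustration, not values taken from the documentation.

```javascript
// Hypothetical helper (not part of Puppeteer): builds a header/footer template.
// Styles are inlined because page styles are not visible inside templates.
function buildTemplate(leftClass, rightHtml) {
  const style =
    'font-size: 10px; width: 100%; padding: 0 1cm; ' +
    'display: flex; justify-content: space-between;';
  return (
    `<div style="${style}">` +
    `<span class="${leftClass}"></span>` +
    `<span>${rightHtml}</span>` +
    `</div>`
  );
}

// Options object intended for page.pdf(options).
// The margin values are an assumption: they leave room for the header and footer.
const pdfOptions = {
  format: 'A4',
  displayHeaderFooter: true,
  headerTemplate: buildTemplate('title', '<span class="date"></span>'),
  footerTemplate: buildTemplate(
    'url',
    'Page <span class="pageNumber"></span> of <span class="totalPages"></span>'
  ),
  margin: { top: '2cm', bottom: '2cm' },
};

console.log(pdfOptions.headerTemplate);
console.log(pdfOptions.footerTemplate);
```

With a running browser, the object would be passed as `await page.pdf(pdfOptions)`; that call is left out here so the sketch runs without Puppeteer installed.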
We can learn from the Puppeteer source code for page.pdf() that:
- Page.printToPDF (along with the headerTemplate and footerTemplate parameters) is sent to page._client.
- page._client is an instance of page.target().createCDPSession() (a Chrome DevTools Protocol session).
From the Chrome DevTools Protocol Viewer, we can see that Page.printToPDF contains the parameters headerTemplate and footerTemplate:
Page.printToPDF
Print page as PDF.
PARAMETERS
- headerTemplate string (optional): HTML template for the print header. Should be valid HTML markup with following classes used to inject printing values into them:
  - date: formatted print date
  - title: document title
  - url: document location
  - pageNumber: current page number
  - totalPages: total pages in the document
  For example, <span class=title></span> would generate span containing the title.
- footerTemplate string (optional): HTML template for the print footer. Should use the same format as the headerTemplate.
RETURN OBJECT
- data string: Base64-encoded pdf data.
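Because the protocol returns the PDF as a Base64 string, the caller has to decode it before writing it to disk. Here is a small sketch of that step; decodePrintToPdfResult is a hypothetical helper, and fakeResult is a stand-in for a real protocol response so that the example runs without a browser.

```javascript
// Hypothetical helper: turns a Page.printToPDF result object into raw PDF bytes.
function decodePrintToPdfResult(result) {
  return Buffer.from(result.data, 'base64');
}

// Stand-in for a real response. With a CDP session it would come from something like:
//   const result = await session.send('Page.printToPDF', {
//     displayHeaderFooter: true, headerTemplate, footerTemplate,
//   });
const fakeResult = {
  data: Buffer.from('%PDF-1.4\n', 'latin1').toString('base64'),
};

const pdfBytes = decodePrintToPdfResult(fakeResult);

// Print the first bytes of the decoded data (the PDF signature of the stand-in).
console.log(pdfBytes.subarray(0, 5).toString('latin1'));
```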
The Chromium source code for Page.printToPDF shows us that:
- Page.printToPDF parameters are passed to the sendDevToolsMessage function, which issues a DevTools protocol command and returns a promise for the results.
After further digging, we can see that Chromium has a concrete implementation of a class called SkDocument that creates PDF files.
SkDocument comes from the Skia Graphics Library, which Chromium uses for PDF generation.
The Skia PDF Theory of Operation, in the PDF Objects and Document Structure section, states that:
Background: The PDF file format has a header, a set of objects and then a footer that contains a table of contents for all of the objects in the document (the cross-reference table). The table of contents lists the specific byte position for each object. The objects may have references to other objects and the ASCII size of those references is dependent on the object number assigned to the referenced object; therefore we can’t calculate the table of contents until the size of objects is known, which requires assignment of object numbers. The document uses SkWStream::bytesWritten() to query the offsets of each object and build the cross-reference table.
The document explains further down:
The PDF backend requires all indirect objects used in a PDF to be added to the SkPDFObjNumMap of the SkPDFDocument. The catalog is responsible for assigning object numbers and generating the table of contents required at the end of PDF files. In some sense, generating a PDF is a three step process. In the first step all the objects and references among them are created (mostly done by SkPDFDevice). In the second step, SkPDFObjNumMap assigns and remembers object numbers. Finally, in the third step, the header is printed, each object is printed, and then the table of contents and trailer are printed. SkPDFDocument takes care of collecting all the objects from the various SkPDFDevice instances, adding them to an SkPDFObjNumMap, iterating through the objects once to set their file positions, and iterating again to generate the final PDF.
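To make the three steps concrete, here is a toy sketch that numbers a list of objects, writes them after a file header, and records each byte offset for the cross-reference table. It is not Skia's code: writeToyPdf and every other name in it are hypothetical, and it tracks a running byte count in a single pass, where the quoted description has Skia iterate over the objects twice.

```javascript
// Toy illustration only; none of these names come from Skia.
function writeToyPdf(objectBodies) {
  // Step 1: the objects already exist (objectBodies).
  // Step 2: assign object numbers, 1-based, in insertion order.
  const numbered = objectBodies.map((body, i) => ({ num: i + 1, body }));

  // Step 3: write the header, then each object, remembering where it starts.
  let out = '%PDF-1.4\n';
  const offsets = [];
  for (const { num, body } of numbered) {
    offsets.push(Buffer.byteLength(out, 'latin1')); // bytes written so far
    out += `${num} 0 obj\n${body}\nendobj\n`;
  }

  // Cross-reference table: one 10-digit byte offset per object.
  const xrefStart = Buffer.byteLength(out, 'latin1');
  out += `xref\n0 ${numbered.length + 1}\n`;
  out += '0000000000 65535 f \n'; // entry for the reserved object 0
  for (const offset of offsets) {
    out += `${String(offset).padStart(10, '0')} 00000 n \n`;
  }

  // Trailer points back at the start of the cross-reference table.
  out += `trailer\n<< /Size ${numbered.length + 1} /Root 1 0 R >>\n`;
  out += `startxref\n${xrefStart}\n%%EOF\n`;
  return { text: out, offsets, xrefStart };
}

const toy = writeToyPdf([
  '<< /Type /Catalog /Pages 2 0 R >>',
  '<< /Type /Pages /Kids [] /Count 0 >>',
]);
console.log(toy.text);
```

Note that "header" and "footer" in the Skia text refer to the PDF file's own header and trailer, which is what this sketch writes; they are a different thing from the printed page header and footer produced by headerTemplate and footerTemplate.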
Thanks to the other answer (https://stackoverflow.com/a/51460641/364131) and codesearch, I think I found most of the answers I was looking for.
The printing implementation is in PrintPageInternal. It uses two separate WebFrames: one to render the content, and one to render the header and footer. The rendering for the header and footer is done by creating a special frame, writing the contents of print_header_and_footer_template_page.html to this frame, calling the setup function with the options provided and then printing to a shared canvas. After this, the rest of the contents of the page are printed on the same canvas within the bounds defined by the margins.
Headers and footers are scaled by a fudge_factor which isn't applied to the rest of the content. There might be something funny going on here with the DPIs (which might explain the fudge_factor of 1.33333333f, which is equal to 96/72).
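The arithmetic behind that guess is easy to check: a CSS pixel is 1/96 of an inch and a PDF point is 1/72 of an inch, so their ratio is 96/72. The sketch below only verifies the number; whether this is the actual reason for the fudge_factor in Chromium remains a guess, and the helper names are hypothetical.

```javascript
const CSS_PIXELS_PER_INCH = 96; // CSS reference pixel: 1/96 inch
const PDF_POINTS_PER_INCH = 72; // PDF/PostScript point: 1/72 inch

const ratio = CSS_PIXELS_PER_INCH / PDF_POINTS_PER_INCH;

// Hypothetical helpers for converting between the two units.
function pointsToCssPixels(points) {
  return points * ratio;
}

function cssPixelsToPoints(pixels) {
  return pixels / ratio;
}

console.log(ratio.toFixed(8)); // 1.33333333
```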
I'm guessing this special frame is what prevents the header and footer from sharing the same resources (styles, fonts, etc.) as the contents of the page. It probably isn't set up to load (and wait for) any additional resources requested by the header and footer templates, which is why the requested fonts don't load.