
How to get Puppeteer PDF generation to match the HTML document exactly with regard to page breaks?

I am using Puppeteer to generate PDF files, using static HTML as the source:

const page = await browser.newPage();
await page.setContent(html); //html is read in from the file system

const pdf = await page.pdf({
    format: 'A4',
    printBackground: true,
    preferCSSPageSize: true
});

The same HTML is also shown to front-end users of my application, so they can get an exact preview of the content, before downloading the PDF.

To match the size of an A4 piece of paper, I am using CSS to set the <body> tag of the HTML to a certain width and height, accounting for page margins in the process.

So for example, my CSS may look like this:

@page {
    margin: 1cm; /* tells Puppeteer to print the PDF with a 1cm margin */
}

body {
    width: 19cm; /* 21cm width minus 1cm margin on each side */
    height: 27.7cm; /* 29.7cm height minus 1cm margin top and bottom */
}
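The sizing arithmetic described above can be checked with a short sketch. The helper names (`contentBoxCm`, `cmToCssPx`) are hypothetical and not part of Puppeteer; the only facts relied on are the A4 dimensions and the CSS definition of 96 px per inch. The pixel conversion is included because the resulting height is fractional, which might be worth keeping in mind when content sits right at the page boundary; the post itself does not confirm that this is the cause.

```javascript
// A4 sheet size in centimetres.
const A4_CM = { width: 21, height: 29.7 };

// CSS absolute units: 1in = 2.54cm = 96px.
const CM_PER_INCH = 2.54;
const CSS_PX_PER_INCH = 96;

// Hypothetical helper: size of the printable area once the same
// margin is removed from every side of the sheet.
function contentBoxCm(sheet, marginCm) {
  return {
    width: sheet.width - 2 * marginCm,
    height: sheet.height - 2 * marginCm,
  };
}

// Hypothetical helper: convert centimetres to CSS pixels.
function cmToCssPx(cm) {
  return (cm / CM_PER_INCH) * CSS_PX_PER_INCH;
}

const box = contentBoxCm(A4_CM, 1);
console.log(box.width.toFixed(1), box.height.toFixed(1)); // 19.0 27.7
console.log(cmToCssPx(box.height).toFixed(2)); // 1046.93
```

The content box matches the 19cm by 27.7cm used for the body, and the height works out to a non-integer number of CSS pixels.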

The issue I am facing is with regard to page breaks; Puppeteer sometimes splits the bottom content into separate pages.

For example, this is how the bottom of the A4 page representation looks to the front-end user:

[Image: screenshot of the bottom of the HTML page preview]

As you can see, there is clearly enough space for the bottom row of text to fit; it is not being cut off.

However, Puppeteer prints the PDF like so:

[Image: screenshot of the PDF generated by Puppeteer]

i.e. it splits the text into two separate pages.

This behavior also seems erratic: at times (e.g. with different text or paragraph lengths), the content is not split into separate pages.

Do you have any idea as to why Puppeteer is splitting the text? I have gone through the documentation but cannot seem to find any solutions for this.

Thanks!

asked Jan 03 '21 00:01 by badcoder



1 Answer

The problem is a mismatch between your CSS settings for the page size and the A4 page size that Chrome uses to print.

Have a look at the following question and answer, specifically the CSS settings in the accepted answer.

CSS to set A4 paper size

The proposed solution is to also use the print media rule.

They have a specific demo with the following code:

@page {
  size: A4;
  margin: 0;
}
@media print {
  html, body {
    width: 210mm;
    height: 297mm;
  }
  /* ... the rest of the rules ... */
}

I modified their demo slightly to include your Lorem Ipsum bulleted text. You can view it at http://jsfiddle.net/x7s2cntj/1/.

Click Run to see the result, or try it in headless Chrome using Puppeteer.
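For the Puppeteer route, a minimal sketch might look like the following. It assumes the `puppeteer` package is installed and that the print rules are injected with `page.addStyleTag` rather than already being present in the HTML; `PRINT_CSS`, `buildPdfOptions`, and `renderPdf` are hypothetical names chosen for illustration, and the PDF options are the ones from the question.

```javascript
// Print rules from the demo above: an A4 page box with no @page margin,
// and html/body sized to the full sheet for print media.
const PRINT_CSS = `
@page {
  size: A4;
  margin: 0;
}
@media print {
  html, body {
    width: 210mm;
    height: 297mm;
  }
}`;

// Same options as in the question. With preferCSSPageSize enabled,
// a CSS @page size takes priority over the `format` option.
function buildPdfOptions() {
  return { format: 'A4', printBackground: true, preferCSSPageSize: true };
}

// Hypothetical helper: render an HTML string to a PDF buffer.
async function renderPdf(html) {
  // Loaded lazily; assumes `npm install puppeteer` has been run.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setContent(html, { waitUntil: 'load' });
    // Inject the print rules in case the HTML does not already carry them.
    await page.addStyleTag({ content: PRINT_CSS });
    return await page.pdf(buildPdfOptions());
  } finally {
    // Always close the browser, even if rendering throws.
    await browser.close();
  }
}
```

Calling `renderPdf(html)` with the HTML read from the file system and writing the returned buffer to disk gives a PDF to compare against the on-screen preview. Whether it removes the stray page break for a given document still needs to be checked case by case.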

I removed the snippet from Stack Overflow because some additional CSS seems to be applied inside the snippet window.

answered Oct 29 '22 05:10 by Menelaos