Puppeteer wait until page is completely loaded

I am working on generating a PDF from a web page.

The application I am working on is a single-page application.

I tried many of the options and suggestions from https://github.com/GoogleChrome/puppeteer/issues/1412, but none of them worked.

    const browser = await puppeteer.launch({
        executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
        ignoreHTTPSErrors: true,
        headless: true,
        devtools: false,
        args: ['--no-sandbox', '--disable-setuid-sandbox']
    });

    const page = await browser.newPage();

    await page.goto(fullUrl, {
        waitUntil: 'networkidle2'
    });

    await page.type('#username', 'scott');
    await page.type('#password', 'tiger');

    await page.click('#Login_Button');
    await page.waitFor(2000);

    await page.pdf({
        path: outputFileName,
        displayHeaderFooter: true,
        headerTemplate: '',
        footerTemplate: '',
        printBackground: true,
        format: 'A4'
    });

What I want is to generate the PDF report as soon as the page has loaded completely.

I don't want to add any fixed delays, i.e. await page.waitFor(2000);

I cannot use waitForSelector because the page has charts and graphs that are rendered after calculations.

Any help will be appreciated.

n.sharvarish asked Sep 25 '18 11:09


2 Answers

You can use page.waitForNavigation() to wait for the new page to load completely before generating a PDF:

    await page.goto(fullUrl, {
      waitUntil: 'networkidle0',
    });

    await page.type('#username', 'scott');
    await page.type('#password', 'tiger');

    await page.click('#Login_Button');

    await page.waitForNavigation({
      waitUntil: 'networkidle0',
    });

    await page.pdf({
      path: outputFileName,
      displayHeaderFooter: true,
      headerTemplate: '',
      footerTemplate: '',
      printBackground: true,
      format: 'A4',
    });
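
One caveat with calling page.click() and then page.waitForNavigation() in sequence: if the navigation completes before the wait is registered, the wait can run into its timeout. A common way around this is to start the wait first and run both together with Promise.all. The sketch below assumes the click really triggers a navigation (a URL change); in a single-page application that only re-renders in place, waitForNavigation may never resolve. The helper name clickAndWaitForNavigation is made up for this example.

```javascript
// Hypothetical helper: registers the navigation wait BEFORE clicking,
// so a fast navigation cannot finish before anyone is listening for it.
// `page` is a Puppeteer Page; `selector` is the element to click.
async function clickAndWaitForNavigation(page, selector, waitUntil = 'networkidle0') {
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil }), // starts listening first
    page.click(selector),                  // then triggers the navigation
  ]);
  return response; // main-frame response, or null for History API navigations
}

// Usage inside an async function, with the selector from the question:
//   await clickAndWaitForNavigation(page, '#Login_Button');
//   await page.pdf({ path: outputFileName, format: 'A4' });
```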

If a particular dynamically generated element must be included in your PDF, consider using page.waitForSelector() to ensure that it is visible:

    await page.waitForSelector('#example', {
      visible: true,
    });
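
When a single selector is not enough, for example when several charts have to be drawn first, page.waitForFunction() can poll an arbitrary condition inside the page. The following is only a sketch: it assumes that "rendered" can be expressed as "at least N elements match a selector", which may not hold for every charting library. The helper name waitForElementCount and the 'canvas' selector in the usage comment are illustrative, not taken from the question.

```javascript
// Hypothetical helper: resolves once at least `minCount` elements matching
// `selector` exist in the page, or rejects when `timeout` (ms) elapses.
function waitForElementCount(page, selector, minCount, timeout = 30000) {
  return page.waitForFunction(
    // Runs in the browser context; `sel` and `n` are passed in as arguments.
    (sel, n) => document.querySelectorAll(sel).length >= n,
    { timeout },
    selector,
    minCount
  );
}

// Usage inside an async function (selector and count are assumptions):
//   await waitForElementCount(page, 'canvas', 4);
//   await page.pdf({ path: outputFileName, format: 'A4' });
```
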
Grant Miller answered Sep 19 '22 00:09


The networkidle events do not always indicate that the page has completely loaded. A few scripts may still be modifying the content on the page. Watching for the HTML source to stop changing seems to yield better results. Here's a function you could use:

    const waitTillHTMLRendered = async (page, timeout = 30000) => {
      const checkDurationMsecs = 1000;
      const maxChecks = timeout / checkDurationMsecs;
      let lastHTMLSize = 0;
      let checkCounts = 1;
      let countStableSizeIterations = 0;
      const minStableSizeIterations = 3;

      while (checkCounts++ <= maxChecks) {
        let html = await page.content();
        let currentHTMLSize = html.length;

        let bodyHTMLSize = await page.evaluate(() => document.body.innerHTML.length);

        console.log('last: ', lastHTMLSize, ' <> curr: ', currentHTMLSize, " body html size: ", bodyHTMLSize);

        // Count consecutive checks in which the HTML size did not change.
        if (lastHTMLSize != 0 && currentHTMLSize == lastHTMLSize)
          countStableSizeIterations++;
        else
          countStableSizeIterations = 0; // reset the counter

        if (countStableSizeIterations >= minStableSizeIterations) {
          console.log("Page rendered fully..");
          break;
        }

        lastHTMLSize = currentHTMLSize;
        // page.waitFor() was deprecated and later removed from Puppeteer;
        // a plain timer behaves the same here and works in any version.
        await new Promise((resolve) => setTimeout(resolve, checkDurationMsecs));
      }
    };

Call it after the page load or click and before you process the page content, for example:

    await page.goto(url, {'timeout': 10000, 'waitUntil':'load'});
    await waitTillHTMLRendered(page);
    const data = await page.content();
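
Note that waitTillHTMLRendered returns in the same way whether the HTML settled or the timeout ran out. If the caller needs to tell the two apart, the same polling idea can be written as a small generic helper that returns true or false. This is a sketch of a variation, not the answer's original code; the name waitUntilStable and its option names are invented for this example.

```javascript
// Hypothetical helper: polls an async getter until it returns the same value
// `stableChecks` times in a row. Resolves to true once the value is stable,
// or to false if `timeoutMs` elapses first.
async function waitUntilStable(
  getValue,
  { intervalMs = 1000, stableChecks = 3, timeoutMs = 30000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  let last;
  let isFirstCheck = true;
  let stableCount = 0;

  while (Date.now() < deadline) {
    const current = await getValue();
    if (!isFirstCheck && current === last) {
      stableCount += 1;
      if (stableCount >= stableChecks) return true;
    } else {
      stableCount = 0; // value changed, start counting again
    }
    isFirstCheck = false;
    last = current;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Usage inside an async function, with a Puppeteer page:
//   const settled = await waitUntilStable(
//     () => page.evaluate(() => document.body.innerHTML.length)
//   );
//   if (!settled) console.warn('Page was still changing at timeout');
```
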
Anand Mahajan answered Sep 18 '22 00:09