
Is there a way to get puppeteer's waitUntil "networkidle" to only consider XHR (ajax) requests?

I am using puppeteer to evaluate the javascript-based HTML of web pages in my test app.

This is the code I am using to make sure all the data has loaded:

await page.setRequestInterception(true);
page.on("request", (request) => {
  if (request.resourceType() === "image" || request.resourceType() === "font" || request.resourceType() === "media") {
    console.log("Request intercepted! ", request.url(), request.resourceType());
    request.abort();
  } else {
    request.continue();
  }
});
try {
  await page.goto(url, { waitUntil: ['networkidle0', 'load'], timeout: requestCounterMaxWaitMs });
} catch (e) {

}

Is this the best way to wait for ajax requests to be completed?

It feels right but I'm not sure if I should use networkidle0, networkidle1, etc?

Nicholas DiPiazza asked Mar 28 '18 15:03


2 Answers

You can use pending-xhr-puppeteer, a library that exposes a promise which resolves once all pending XHR requests have finished.

Use it like this:

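const puppeteer = require('puppeteer');
const { PendingXHR } = require('pending-xhr-puppeteer');

(async () => {
  // placeholder: put your own Chromium launch flags here, if any
  const args = [];
  const browser = await puppeteer.launch({
    headless: true,
    args,
  });

  const page = await browser.newPage();
  // create the tracker before navigating so no XHR is missed
  const pendingXHR = new PendingXHR(page);
  await page.goto(`http://page-with-xhr`);
  // At this point some XHR requests may still be pending
  await pendingXHR.waitForAllXhrFinished();
  // At this point all XHR requests have finished
  await browser.close();
})();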

DISCLAIMER: I am the maintainer of pending-xhr-puppeteer
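
If you would rather not add a dependency, the same idea can be sketched by hand with Puppeteer's 'request', 'requestfinished' and 'requestfailed' page events. The following is only a minimal sketch under assumptions and is not the library's implementation: trackXhr, pendingCount and waitForQuiet are hypothetical names, and the quiet window (quietMs) is an assumed heuristic for catching XHRs that start shortly after the previous ones finish.

```javascript
// Hypothetical helper, not part of Puppeteer or pending-xhr-puppeteer.
// `page` only needs on() for the 'request', 'requestfinished' and
// 'requestfailed' events, which a Puppeteer Page emits.
const AJAX_TYPES = new Set(['xhr', 'fetch']);

function trackXhr(page) {
  const inFlight = new Set();   // XHR/fetch requests that have not ended yet
  const watchers = new Set();   // callbacks to run whenever inFlight changes
  const notify = () => watchers.forEach((fn) => fn());

  page.on('request', (req) => {
    if (AJAX_TYPES.has(req.resourceType())) {
      inFlight.add(req);
      notify();
    }
  });
  const onEnd = (req) => {
    if (inFlight.delete(req)) notify();
  };
  page.on('requestfinished', onEnd);
  page.on('requestfailed', onEnd); // aborted or failed requests also count as ended

  return {
    pendingCount: () => inFlight.size,
    // Resolves once nothing has been in flight for quietMs; rejects after timeoutMs.
    waitForQuiet({ quietMs = 500, timeoutMs = 30000 } = {}) {
      return new Promise((resolve, reject) => {
        let quietTimer;
        const done = (err) => {
          clearTimeout(quietTimer);
          clearTimeout(deadline);
          watchers.delete(check);
          if (err) reject(err);
          else resolve();
        };
        const check = () => {
          clearTimeout(quietTimer); // any change restarts the quiet window
          if (inFlight.size === 0) quietTimer = setTimeout(() => done(), quietMs);
        };
        const deadline = setTimeout(
          () => done(new Error(`XHR traffic still active after ${timeoutMs} ms`)),
          timeoutMs
        );
        watchers.add(check);
        check();
      });
    },
  };
}

// Usage sketch (inside an async function):
//   const tracker = trackXhr(page);               // before page.goto()
//   await page.goto(url, { waitUntil: 'load' });
//   await tracker.waitForQuiet({ quietMs: 1000 });
```

The listeners never call request.continue() or request.abort(), so the tracker can sit alongside an existing request-interception handler.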

Julien TASSIN answered Sep 30 '22 19:09


XHRs, by their nature, can be fired at any later point in the app's lifetime. networkidle0 will not help you if the app sends an XHR after, for example, 1 second and you want to wait for it. I think that if you want to do this "properly", you should know which requests you are waiting for and await them explicitly.

Here is an example where the XHRs are fired later in the app and the script waits for all of them:

const puppeteer = require('puppeteer');

const html = `
<html>
  <body>
    <script>
      setTimeout(() => {
        fetch('https://swapi.co/api/people/1/');
      }, 1000);

      setTimeout(() => {
        fetch('https://www.metaweather.com/api/location/search/?query=san');
      }, 2000);

      setTimeout(() => {
        fetch('https://api.fda.gov/drug/event.json?limit=1');
      }, 3000);
    </script>
  </body>
</html>`;

// you can observe only a subset of the requests;
// in this example I'm waiting for all of them
const requests = [
    'https://swapi.co/api/people/1/',
    'https://www.metaweather.com/api/location/search/?query=san',
    'https://api.fda.gov/drug/event.json?limit=1'
];

const waitForRequests = (page, names) => {
  const requestsList = [...names];
  return new Promise(resolve =>
     page.on('request', request => {
       // fetch() calls are reported as "fetch", XMLHttpRequest calls as "xhr"
       if (["xhr", "fetch"].includes(request.resourceType())) {
         // check if request is in observed list
         const index = requestsList.indexOf(request.url());
         if (index > -1) {
           requestsList.splice(index, 1);
         }

         // resolve once every observed request has been issued
         if (!requestsList.length) {
           resolve();
         }
       }
       request.continue();
     })
  );
};


(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setRequestInterception(true);

  // register page.on('request') observables
  const observedRequests = waitForRequests(page, requests);

  // page.goto() is not awaited here because only the XHR (ajax) requests
  // matter; awaiting it is optional
  page.goto(`data:text/html,${html}`);

  console.log('before xhr');
  // await for all observed requests
  await observedRequests;
  console.log('after all xhr');
  await browser.close();
})();
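
Note that the example above resolves as soon as the observed requests have been sent, not when their responses arrive. If you need the responses themselves, one possible variation is to build on Puppeteer's page.waitForResponse(urlOrPredicate, options). The sketch below assumes exact URL matching is good enough; waitForResponses is a hypothetical helper name, not a Puppeteer API.

```javascript
// Hypothetical helper: resolves with one response per URL, in the same order,
// once a response has arrived for every URL in `urls`.
// Relies on Puppeteer's page.waitForResponse(urlOrPredicate, options).
function waitForResponses(page, urls, timeoutMs = 30000) {
  return Promise.all(
    urls.map((url) =>
      page.waitForResponse((response) => response.url() === url, {
        timeout: timeoutMs,
      })
    )
  );
}

// Usage sketch (inside an async function, with `page`, `html` and `requests`
// as in the example above):
//   const responses = waitForResponses(page, requests); // start waiting first
//   await page.goto(`data:text/html,${html}`);
//   await responses;
```

Start waiting before the navigation, as in the usage sketch, so that a response arriving early is not missed. No request interception is needed for this variant.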
Everettss answered Sep 30 '22 20:09