
How can I get the raw download size of a request using Puppeteer?

That is, the total amount of data downloaded across all resources (including video/media), similar to that returned by Chrome DevTools' Network tab.

mjs asked Jan 15 '18 12:01



2 Answers

As of January 2018, there doesn't seem to be a way to do this that works with all resource types (listening for the response event fails for videos) and that also counts compressed resources correctly.

The best workaround seems to be to listen for the Network.dataReceived event, and process the event manually:

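// Note: page._client and page._networkManager are private Puppeteer
// internals and may change between versions.
const resources = {};
page._client.on('Network.dataReceived', (event) => {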
  const request = page._networkManager._requestIdToRequest.get(
    event.requestId
  );
  // Skip events with no matching request, and skip inline data: URLs.
  if (!request || request.url().startsWith('data:')) {
    return;
  }
  const url = request.url();
  // encodedDataLength is supposed to be the amount of data received
  // over the wire, but it's often 0, so just use dataLength for consistency.
  // https://chromedevtools.github.io/devtools-protocol/tot/Network/#event-dataReceived
  // const length = event.encodedDataLength > 0 ?
  //     event.encodedDataLength : event.dataLength;
  const length = event.dataLength;
  if (url in resources) {
    resources[url] += length;
  } else {
    resources[url] = length;
  }
});

// page.goto(...), etc.

// totalCompressedBytes is unavailable; see comment above
const totalUncompressedBytes = Object.values(resources).reduce((a, n) => a + n, 0);
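
If both figures are of interest, the accumulation step can be kept separate from the Puppeteer wiring so that encoded and decoded byte counts are tallied side by side. This is only a minimal sketch, not part of the original workaround: recordChunk and sumTotals are hypothetical helper names, the sample events are made up, and the only assumption about the event is that it carries the dataLength and encodedDataLength fields documented for Network.dataReceived.

```javascript
// Hypothetical helper (not a Puppeteer API): add one dataReceived
// event to the per-URL tally, tracking both byte counts.
function recordChunk(totals, url, event) {
  const entry = totals[url] || (totals[url] = { decoded: 0, encoded: 0 });
  entry.decoded += event.dataLength || 0;
  entry.encoded += event.encodedDataLength || 0;
  return totals;
}

// Hypothetical helper: sum the tally across all URLs.
function sumTotals(totals) {
  return Object.values(totals).reduce(
    (acc, t) => ({
      decoded: acc.decoded + t.decoded,
      encoded: acc.encoded + t.encoded,
    }),
    { decoded: 0, encoded: 0 }
  );
}

// Made-up events, standing in for what the listener would receive.
const totals = {};
recordChunk(totals, 'https://example.com/a.js', { dataLength: 1000, encodedDataLength: 300 });
recordChunk(totals, 'https://example.com/a.js', { dataLength: 500, encodedDataLength: 0 });
recordChunk(totals, 'https://example.com/b.mp4', { dataLength: 2048, encodedDataLength: 2048 });

const summary = sumTotals(totals);
console.log(summary);
```

Inside the listener, the call would be recordChunk(totals, url, event). Since encodedDataLength is often 0, the encoded total should be treated as a lower bound rather than an exact wire size.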
mjs answered Sep 22 '22 12:09



If you are using Puppeteer, you already have server-side Node. Why not pipe the request through one or more streams and then calculate the content size?
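
One way to read this suggestion, assuming the resource is fetched from Node itself rather than inside the browser, is a pass-through stream that counts bytes as they go by. The sketch below is illustrative only: ByteCounter is a hypothetical class name, and it measures only what flows through the Node stream, not traffic the page loads on its own.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: passes data through unchanged while counting bytes.
class ByteCounter extends Transform {
  constructor(options) {
    super(options);
    this.bytes = 0;
  }

  _transform(chunk, encoding, callback) {
    // Strings are measured in bytes, not characters.
    this.bytes += Buffer.isBuffer(chunk)
      ? chunk.length
      : Buffer.byteLength(chunk, encoding);
    callback(null, chunk);
  }
}

// Usage sketch: source.pipe(counter).pipe(destination);
// counter.bytes holds the total once the stream has finished.
const counter = new ByteCounter();
counter.on('finish', () => console.log(`bytes seen: ${counter.bytes}`));
counter.resume(); // discard the output so the stream keeps flowing
counter.write(Buffer.from('hello'));
counter.end(Buffer.from(' world'));
```

If the source is a compressed HTTP response, placing the counter before any decompression step gives the on-the-wire size, and placing it after gives the decoded size.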

There is also https://github.com/watson/request-stats

You may also want to call page.waitForNavigation, as you may be running into async timing issues.

rexfordkelly answered Sep 19 '22 12:09
