Is it safe to run multiple instances of Puppeteer at the same time?

Is it safe/supported to run multiple instances of Puppeteer at the same time, either at

  1. the process level (multiple node screenshot.js at the same time) or
  2. at the script level (multiple puppeteer.launch() at the same time)?

What are the recommended settings/limits on parallel processes?

(In my tests, (1) seems to work fine, but I'm wondering about the reliability of Puppeteer's interactions with the single (?) instance of Chrome. I haven't tried (2) but that seems less likely to work out.)

Asked by mjs on Jan 18 '18 at 09:01



1 Answer

It's fine to run multiple browsers, contexts, or even pages in parallel. The practical limits depend on your network, disk, memory, and the kind of tasks you run.
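Since the limit is set by your own resources, it can help to cap how many tasks run at once. The following is a minimal sketch of such a cap, not part of Puppeteer itself; `runWithLimit` is a hypothetical helper name, and each task is assumed to be an async function that does its own browser work.

```javascript
// Hypothetical helper (not a Puppeteer API): runs async task functions
// with at most `limit` of them in flight at any time.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task to start

  // Each worker keeps pulling the next unstarted task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  const workers = Array.from({ length: workerCount }, () => worker());
  await Promise.all(workers);
  return results; // results are in the same order as `tasks`
}
```

With Puppeteer, each task might open a page in a shared browser with `browser.newPage()`, do its work, and close the page again. If a task throws, the returned promise rejects, so error handling per task is left to the caller in this sketch.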

I crawled a few million pages, and from time to time (in my setup, roughly every 10,000 pages) Puppeteer would crash. You should therefore have a way to automatically restart the browser and retry the job.
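A restart-and-retry loop could look roughly like the sketch below. It is an illustration under assumptions, not the approach used by any particular library: `createRestartingRunner` is a hypothetical helper, and `launchBrowser` is any function you pass in that resolves to an object with a `close()` method, for example `() => puppeteer.launch()`.

```javascript
// Hypothetical helper: runs jobs against a browser and, when a job throws,
// closes the browser, launches a fresh one, and retries the job.
async function createRestartingRunner(launchBrowser, maxAttempts = 3) {
  let browser = await launchBrowser();

  async function restart() {
    try {
      await browser.close();
    } catch (_) {
      // The browser may already be dead; ignore errors while closing.
    }
    browser = await launchBrowser();
  }

  // Runs `job(browser)`; gives up after `maxAttempts` failed attempts.
  async function run(job) {
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await job(browser);
      } catch (err) {
        lastError = err;
        if (attempt < maxAttempts) await restart();
      }
    }
    throw lastError;
  }

  async function close() {
    await browser.close();
  }

  return { run, close };
}
```

This sketch restarts the browser on any error, which is simple but heavy-handed; a real setup might restart only on crashes or disconnects and merely retry on ordinary page errors.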

You might want to check out puppeteer-cluster, which takes care of pooling the browser instances, crash detection, and restarting. (Disclaimer: I'm the author.)

An example of creating a cluster is below:

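    const { Cluster } = require('puppeteer-cluster');

    (async () => {
        // Create a cluster that handles 10 parallel browsers
        const cluster = await Cluster.launch({
            concurrency: Cluster.CONCURRENCY_BROWSER,
            maxConcurrency: 10,
        });

        // Queue your jobs (one example)
        cluster.queue(async ({ page }) => {
            await page.goto('http://www.wikipedia.org');
            await page.screenshot({ path: 'wikipedia.png' });
        });

        // Wait for the queue to drain, then shut down the browsers
        await cluster.idle();
        await cluster.close();
    })();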

This is just a minimal example. There are many more ways to use the cluster.

Answered by Thomas Dondorf on Sep 18 '22 at 12:09
