
Puppeteer error: Navigation failed because browser has disconnected

I am using Puppeteer on Google App Engine with Node.js.

Whenever I run Puppeteer on App Engine, I get this error:

Navigation failed because browser has disconnected!

This works fine in my local environment, so I am guessing it is a problem with App Engine. This is how I launch the browser:

const browser = await puppeteer.launch({
    ignoreHTTPSErrors: true,
    headless: true,
    args: ["--disable-setuid-sandbox", "--no-sandbox"],
});

This is my App Engine app.yaml file:

runtime: nodejs12

env: standard

handlers:
  - url: /.*
    secure: always
    script: auto

-- EDIT --

It works when I add the --disable-dev-shm-usage argument, but then it always times out. Here is my code:

const browser = await puppeteer.launch({
  ignoreHTTPSErrors: true,
  headless: true,
  args: [
    "--disable-gpu",
    "--disable-dev-shm-usage",
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--no-first-run",
    "--no-zygote",
    "--single-process",
  ],
});
const page = await browser.newPage();

try {
  const url = "https://seekingalpha.com/market-news/1";
  const pageOption = {
    waitUntil: "networkidle2",
    timeout: 20000,
  };

  await page.goto(url, pageOption);
} catch (e) {
  console.log(e);
  await page.close();
  await browser.close();
  return resolve("error at 1");
}

try {
  const ulSelector = "#latest-news-list";
  await page.waitForSelector(ulSelector, { timeout: 30000 });
} catch (e) {
  // ALWAYS TIMES OUT HERE!
  console.log(e);
  await page.close();
  await browser.close();
  return resolve("error at 2");
}
...
asked Jul 14 '20 09:07 by HumbleCoder


2 Answers

It seems the problem was App Engine's memory capacity.

When an instance does not have enough memory to handle the Puppeteer crawl, App Engine automatically starts another instance.

However, the newly created instance has a different Puppeteer browser, which results in "Navigation failed because browser has disconnected".

The solution is simply to upgrade the App Engine instance class so that a single instance can handle the crawling job.

The default instance class is F1, which has 256 MB of memory. I upgraded to F4, which has 1 GB of memory, and the error message no longer appears.

runtime: nodejs12

instance_class: F4

handlers:
  - url: /.*
    secure: always
    script: auto
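
To check whether memory really is the bottleneck before paying for a bigger instance class, something like the following could be logged around the crawl. This is only a rough sketch: memoryReport and INSTANCE_LIMIT_MB are hypothetical names, the 256 MB default is simply the F1 figure quoted above, and process.memoryUsage() only covers the Node.js process itself, not the separate Chromium process that Puppeteer launches, so the real footprint will be higher than what it reports.

```javascript
// Hypothetical helper: compare this Node.js process's resident memory
// against an assumed instance limit. Chromium's own memory is NOT included.
const INSTANCE_LIMIT_MB = 256; // assumption: the F1 limit quoted in this answer

function memoryReport(limitMb = INSTANCE_LIMIT_MB, usage = process.memoryUsage()) {
  const rssMb = usage.rss / (1024 * 1024); // rss is reported in bytes
  const ratio = rssMb / limitMb;
  return {
    rssMb: Math.round(rssMb),
    limitMb,
    ratio,
    nearLimit: ratio >= 0.8, // arbitrary 80% warning threshold
  };
}

// Log once at startup; call again before and after the crawl to compare.
console.log(memoryReport());
```

If nearLimit is already true for the Node.js process alone, the instance almost certainly has no room left for Chromium.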
answered Oct 17 '22 12:10 by HumbleCoder


For me, the error went away when I stopped using the --use-gl=swiftshader argument.

It is included by default if you use args: chromium.args from chrome-aws-lambda.
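
One way to drop that flag while keeping the rest of the defaults is to filter the array before passing it to launch. A minimal sketch, assuming the args are a plain array of strings; withoutFlags is a hypothetical helper, and exampleArgs is a made-up stand-in for chromium.args, not the real list:

```javascript
// Hypothetical helper: return a copy of args without the unwanted flags.
function withoutFlags(args, unwanted) {
  const blocked = new Set(unwanted);
  return args.filter((arg) => !blocked.has(arg));
}

// Stand-in for chromium.args from chrome-aws-lambda (assumed contents).
const exampleArgs = ["--no-sandbox", "--use-gl=swiftshader", "--disable-gpu"];

const launchArgs = withoutFlags(exampleArgs, ["--use-gl=swiftshader"]);
console.log(launchArgs); // [ '--no-sandbox', '--disable-gpu' ]
```

The filtered array would then be passed as args: launchArgs in the puppeteer.launch options.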

answered Oct 17 '22 13:10 by Calypso