
External resources in Puppeteer with Chrome executable fail to load (net::ERR_EMPTY_RESPONSE)

I'm having issues using external resources in a Puppeteer job that I'm running with a full Chrome executable (not the default Chromium). Any help would be massively appreciated!

For example, loading a video from a public URL fails, even though the same URL works fine when I open it manually in the browser.

// src holds the public URL of the video
const videoElement = document.createElement('video');
videoElement.src = src;
videoElement.onloadedmetadata = function() {
  console.log(videoElement.duration);
};

Here's my Puppeteer call:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    args: [
      '--remote-debugging-port=9222',
      '--autoplay-policy=no-user-gesture-required',
      '--allow-insecure-localhost',
      '--proxy-server=http://localhost:9000',
      '--proxy-bypass-list=""',
      '--no-sandbox',
      '--disable-setuid-sandbox',
    ],
    executablePath:
      '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  });

  const page = await browser.newPage();
  logConsole(page); // helper defined elsewhere (not shown in the question)

  await page.goto(`http://${hostname}/${path}`, {
    waitUntil: 'networkidle2',
  });
  
  await page.waitForSelector('#job-complete');
  console.log('Job complete!');

  await browser.close();
})();

Unlike many Puppeteer examples, the issue here isn't that my test doesn't wait long enough. The resources fail to load / return empty responses almost instantly.
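
To see exactly which requests fail and with what network error, one option is to listen for Puppeteer's requestfailed event. The sketch below is not from the original post: attachFailureLogger is a hypothetical helper name, and it assumes the helper is attached to the page before page.goto is called.

```javascript
// Hypothetical helper (not from the original post): records every request
// that Puppeteer reports as failed, together with the network error text.
function attachFailureLogger(page, log = console.error) {
  const failures = [];
  page.on('requestfailed', (request) => {
    // request.failure() returns null or an object with an errorText string,
    // for example "net::ERR_EMPTY_RESPONSE".
    const failure = request.failure();
    const entry = {
      url: request.url(),
      error: failure ? failure.errorText : 'unknown',
    };
    failures.push(entry);
    log(`FAILED ${entry.url} (${entry.error})`);
  });
  // The returned array is filled in as failures are reported.
  return failures;
}
```

Calling attachFailureLogger(page) right after browser.newPage() would print one line per failed resource, which makes it easier to check whether the failures share a host or an error code.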

It also doesn't appear to be an authentication issue - I reach my own server just fine.

Although I'm not running on https here, the URL I try directly in the browser works without SSL.

I should also mention that this is a React (CRA) website and I'm calling Puppeteer with Node.

I can see that at least 3 other external resources (non-video) also fail. Is there a flag or something I should be using that I'm missing? Thanks so much for any help!

milesaron asked Oct 23 '20 22:10



1 Answer

In my case I had to use puppeteer-extra and puppeteer-extra-plugin-stealth:

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());

I also found the following flags useful:

const browser = await puppeteer.launch({
  args: [
    '--disable-web-security',
    '--autoplay-policy=no-user-gesture-required',
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--remote-debugging-port=9222',
    '--allow-insecure-localhost',
  ],
  executablePath:
    '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
});

Finally, in a few cases I found it necessary to bypass the page's Content Security Policy (CSP):

await page.setBypassCSP(true);
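
Put together, the three steps above might look like the sketch below. The helper names buildLaunchOptions and openPage are hypothetical, the puppeteer argument is assumed to be the puppeteer-extra instance with the stealth plugin already applied, and the flags are copied from this answer. According to the Puppeteer documentation, setBypassCSP usually needs to be called before navigating, so the sketch calls it before page.goto.

```javascript
// Hypothetical helpers (not from the original answer).
const CHROME_PATH =
  '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome';

// Builds the launch options using the flags listed in this answer.
function buildLaunchOptions(executablePath = CHROME_PATH) {
  return {
    args: [
      '--disable-web-security',
      '--autoplay-policy=no-user-gesture-required',
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--remote-debugging-port=9222',
      '--allow-insecure-localhost',
    ],
    executablePath,
  };
}

// Launches the browser, disables CSP enforcement, then navigates.
// `puppeteer` is assumed to be puppeteer-extra with StealthPlugin applied.
async function openPage(puppeteer, url) {
  const browser = await puppeteer.launch(buildLaunchOptions());
  const page = await browser.newPage();
  await page.setBypassCSP(true); // must happen before navigation
  await page.goto(url, { waitUntil: 'networkidle2' });
  return { browser, page };
}
```

Keeping the options in a separate function makes it easy to drop individual flags one at a time and find out which of them is actually needed.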

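Please be careful with these rather insecure settings 😬: --disable-web-security, --no-sandbox, and the CSP bypass each switch off a browser protection, so only use them against pages you trust.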

milesaron answered Oct 09 '22 18:10