
Bypassing CAPTCHAs with Headless Chrome using puppeteer

Google detects that my browser is being controlled by automation software, and because of that I get a reCAPTCHA. When I start Chromium manually and follow the same steps, the reCAPTCHA doesn't appear.

Question 1)

Is it possible to solve the CAPTCHA programmatically, or avoid it altogether, when using Puppeteer?

Question 2)

Does this happen only when the headless option is turned off, i.e.

const browser = await puppeteer.launch({ headless: false });

Or is this something we just have to accept and move on?

rinold simon asked Apr 14 '19 17:04



1 Answer

Try generating a random user agent with the user-agents npm package. This usually gets past protection that is based on the user agent.

In Puppeteer, a page can override the browser's user agent with page.setUserAgent:

const UserAgent = require('user-agents'); // the package exports a class
...
await page.setUserAgent(new UserAgent().toString()); // a new instance is a randomly chosen user agent
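To show where the override fits in a script, here is a minimal sketch. The helper names pickUserAgent and openWithUserAgent are hypothetical, and the small hard-coded pool is only a stand-in for the user-agents package so the sketch has no extra dependency. The one thing it relies on is that page.setUserAgent is called before page.goto.

```javascript
// Illustrative pool only; in practice the strings would come from the
// user-agents package rather than a hard-coded list.
const USER_AGENTS = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];

// Hypothetical helper: pick one entry from the pool.
// `random` is injectable so the choice can be made deterministic.
function pickUserAgent(pool = USER_AGENTS, random = Math.random) {
  return pool[Math.floor(random() * pool.length)];
}

// Hypothetical helper: open a page whose user agent is set before the
// first navigation, so the very first request already carries it.
async function openWithUserAgent(browser, url, userAgent = pickUserAgent()) {
  const page = await browser.newPage();
  await page.setUserAgent(userAgent);
  await page.goto(url);
  return page;
}
```

With a real browser this would be called as `const page = await openWithUserAgent(await puppeteer.launch(), 'https://example.com');`. A user agent alone may not be enough if the site checks other signals.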

Additionally, you can add these two plugins for puppeteer-extra:

puppeteer-extra-plugin-recaptcha - Solves reCAPTCHAs automatically, using a single line of code: page.solveRecaptchas()

NOTE: puppeteer-extra-plugin-recaptcha relies on 2captcha, which is a paid service.

puppeteer-extra-plugin-stealth - Applies various evasion techniques to make detection of headless puppeteer harder.
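Both plugins are registered on puppeteer-extra, which wraps puppeteer. A rough sketch of how they could be wired together follows. The helper names recaptchaOptions, launchWithPlugins and solveOn are hypothetical, the token is a placeholder for your own 2captcha API key, and the packages are required lazily so the options helper can be used without them installed.

```javascript
// Hypothetical helper: build the options for puppeteer-extra-plugin-recaptcha.
// The token is your own 2captcha API key (a paid service).
function recaptchaOptions(token) {
  if (!token) throw new Error('A 2captcha API token is required');
  return { provider: { id: '2captcha', token }, visualFeedback: true };
}

// Hypothetical helper: launch a browser with both plugins enabled.
async function launchWithPlugins(token) {
  // Assumes puppeteer, puppeteer-extra and both plugins are installed.
  const puppeteer = require('puppeteer-extra');
  const StealthPlugin = require('puppeteer-extra-plugin-stealth');
  const RecaptchaPlugin = require('puppeteer-extra-plugin-recaptcha');

  puppeteer.use(StealthPlugin());
  puppeteer.use(RecaptchaPlugin(recaptchaOptions(token)));
  return puppeteer.launch({ headless: true });
}

// Hypothetical helper: open a URL, ask the plugin to solve any reCAPTCHAs
// it finds, and return the page title afterwards.
async function solveOn(url, token) {
  const browser = await launchWithPlugins(token);
  try {
    const page = await browser.newPage();
    await page.goto(url);
    await page.solveRecaptchas();
    return await page.title();
  } finally {
    await browser.close();
  }
}
```

A call might look like `solveOn('https://example.com', process.env.TWOCAPTCHA_TOKEN)`, where the environment variable name is an assumption. The stealth plugin only makes detection harder, so a reCAPTCHA can still appear.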

rinold simon answered Sep 27 '22 23:09