 

Can Scrapy be used with the Chrome Browser?

I need to scrape a web page that is a JavaScript-rendered AngularJS app. The site's developers detect Safari/Firefox in private browsing mode and block it, which also blocks scraping. The page works in Safari/Firefox when you are not in private mode.

The interesting thing is that no such warning is given when using Chrome whether in private mode or not. I was using Scrapy+Selenium, but I was really hoping to use ScrapyJS/Splash for this project. However, it looks like the Scrapy/Splash combination suffers from the website's private browsing wall.

Is it possible to tell Scrapy to use Chrome? I know Selenium has quite a few drivers, and how to use each is pretty well documented, but I can't find any info on whether Scrapy supports other browsers or whether someone else has already done this. Google/SO searches haven't illuminated this for me either.

Randy asked Nov 30 '25 16:11



1 Answer

Starting from Splash 2.0, you can disable Private mode (which is "on" by default).

There are two ways to go about it:

  • at startup, with the --disable-private-mode argument, e.g., if you're using Docker:

    $ sudo docker run -p 5023:5023 -p 8050:8050 -p 8051:8051 scrapinghub/splash --disable-private-mode
    
  • at runtime, when using the /execute endpoint, by setting splash.private_mode_enabled = false in the Lua script
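As a rough illustration of the runtime option, here is a minimal sketch that posts a Lua script to Splash's /execute endpoint using only the Python standard library. The helper names (`build_execute_payload`, `render`), the `http://localhost:8050` address, and the one-second wait are assumptions made for the example, not part of the original answer; adjust them to your setup.

```python
import json
import urllib.request

# Lua script for Splash's /execute endpoint. It switches private mode off
# before loading the page, then returns the rendered HTML in a table,
# which Splash serialises as a JSON object.
LUA_SOURCE = """
function main(splash)
    splash.private_mode_enabled = false
    assert(splash:go(splash.args.url))
    assert(splash:wait(1.0))
    return {html = splash:html()}
end
"""


def build_execute_payload(url):
    """Build the JSON body for /execute (hypothetical helper)."""
    # Extra keys such as "url" are exposed to the script as splash.args.
    return {"lua_source": LUA_SOURCE, "url": url}


def render(url, splash_base="http://localhost:8050"):
    """Ask a running Splash instance to render `url` and return its HTML.

    `splash_base` is an assumed address; it matches the 8050 port mapping
    in the Docker command but may differ on your machine.
    """
    data = json.dumps(build_execute_payload(url)).encode("utf-8")
    request = urllib.request.Request(
        splash_base + "/execute",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["html"]
```

With a Splash 2.0+ container running, calling `render("http://example.com")` should return the page's rendered HTML as a string. The same Lua script can be passed as `lua_source` from a Scrapy spider if you use a Splash integration such as scrapy-splash.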

Also, take note of the effect of disabling private mode:

Note that if you disable private mode browsing data such as cookies or items kept in local storage may persist between requests.

paul trmbrth answered Dec 02 '25 06:12



