
Scrapy shell with playwright

Is it possible to invoke Playwright in a Scrapy shell?

I would like to use the shell to test my XPath expressions, which I intend to place in a spider that uses Scrapy Playwright.

My scrapy settings file has the usual Playwright setup:

# Scrapy Playwright Setup
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

Clemence Mungwariri asked Oct 26 '25 02:10


2 Answers

Yes, it is possible. All you have to do is run scrapy shell inside a folder that contains a Scrapy project. It will automatically load the settings from that project's settings.py. You can confirm this in the logs when the shell starts.

You can also override settings with the -s option:

scrapy shell -s DOWNLOAD_HANDLERS='<<your custom handlers>>' 
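
If you are not inside the project folder, the override above can be spelled out in full. Here is a minimal sketch that assembles the command, assuming Scrapy accepts a JSON string for a dict-valued setting such as DOWNLOAD_HANDLERS when it is passed through -s (check this against your Scrapy version). The handler and reactor paths are copied from the question; build_shell_command is just a hypothetical helper name for illustration.

```python
import json
import shlex

# Values copied from the settings shown in the question.
PLAYWRIGHT_HANDLER = "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler"
ASYNCIO_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"


def build_shell_command():
    """Return the argv list for `scrapy shell` with the Playwright overrides.

    Hypothetical helper: it only builds the command, it does not run it.
    Assumption: Scrapy parses the JSON string given for DOWNLOAD_HANDLERS.
    """
    handlers = json.dumps({"http": PLAYWRIGHT_HANDLER, "https": PLAYWRIGHT_HANDLER})
    return [
        "scrapy",
        "shell",
        "-s",
        f"DOWNLOAD_HANDLERS={handlers}",
        "-s",
        f"TWISTED_REACTOR={ASYNCIO_REACTOR}",
    ]


if __name__ == "__main__":
    # shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
    print(shlex.join(build_shell_command()))
```

Running the script prints a single quoted command line that you can paste into a terminal. Inside the project folder this should not be needed, since settings.py is picked up automatically.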

Happy Scraping :)

Neha Setia Nagpal answered Oct 28 '25 03:10


I had the same issue. In addition to having the Playwright configuration in settings.py and running the shell from within that Scrapy project, I had to pass a meta keyword argument to fetch after starting the shell, like this:

scrapy shell
fetch('<url-of-request>', meta={"playwright": True})

You can then run commands as you normally would in scrapy shell, such as:

view(response)

Trever answered Oct 28 '25 02:10


