How to set splash timeout in scrapy-splash?

I use scrapy-splash to crawl web pages, and I run the Splash service in Docker.

Command:

docker run -p 8050:8050 scrapinghub/splash --max-timeout 3600

But I got a 504 error.

"error": {"info": {"timeout": 30}, "description": "Timeout exceeded rendering page", "error": 504, "type": "GlobalTimeoutError"}

I tried setting splash.resource_timeout, calling request:set_timeout, and setting SPLASH_URL = 'http://localhost:8050?timeout=1800.0', but nothing changed.

Thanks for any help.

Jhon Smith asked Jun 19 '17 10:06


1 Answer

I use the scrapy-splash package and set the timeout in the args parameter of SplashRequest, like this:

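import scrapy_splash

# Inside a spider callback: 'timeout' is sent to Splash as a request argument.
yield scrapy_splash.SplashRequest(
    url, self.parse, endpoint='execute',
    args={'lua_source': script, 'timeout': 3600})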

It works for me.
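The "timeout": 30 in the error from the question is Splash's default, which suggests the timeout never reached Splash as a request argument. To check this outside Scrapy, here is a minimal sketch that builds the same kind of request for the execute endpoint using only the Python standard library. The helper names (build_execute_request, render), the Lua script, and the localhost URL are assumptions for illustration, not part of scrapy-splash.

```python
import json
import urllib.request

# Assumed address of a local Splash container (as in the docker command above).
SPLASH_EXECUTE_URL = "http://localhost:8050/execute"

# A small illustrative Lua script: load the page and return its HTML.
LUA_SCRIPT = """
function main(splash, args)
  assert(splash:go(args.url))
  return splash:html()
end
"""


def build_execute_request(page_url, lua_source, timeout):
    """Build a JSON POST request for Splash's execute endpoint.

    The timeout travels in the request body next to lua_source,
    mirroring the args dict passed to SplashRequest.
    """
    payload = {
        "url": page_url,
        "lua_source": lua_source,
        "timeout": timeout,  # seconds; must not exceed --max-timeout
    }
    return urllib.request.Request(
        SPLASH_EXECUTE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def render(page_url, timeout=3600):
    """Send the request; needs a running Splash instance (hypothetical helper)."""
    request = build_execute_request(page_url, LUA_SCRIPT, timeout)
    # Give the HTTP client slightly longer than Splash itself.
    with urllib.request.urlopen(request, timeout=timeout + 60) as response:
        return response.read().decode("utf-8")
```

Calling render("https://example.com") against the container started with --max-timeout 3600 should then be limited by the requested timeout rather than the 30-second default, assuming the container is reachable at the URL above.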

Tomáš Linhart answered Oct 18 '22 03:10