
Getting service unavailable error in scrapy crawling

Tags:

python

scrapy

I am trying to crawl a forum website with scrapy. The crawler works fine if I have

CONCURRENT_REQUESTS = 1

But if I increase that number then I get this error

2012-12-21 05:04:36+0800 [working] DEBUG: Retrying <GET http://www.example.com/profile.php?id=1580> (failed 1 times): 503 Service Unavailable

I want to know whether the forum is blocking my requests or whether this is a settings problem.

asked Feb 19 '23 by user19140477031
1 Answer

HTTP status code 503, "Service Unavailable", means that (for some reason) the server wasn't able to process your request. It's usually a transient error. If you want to know whether you have been blocked, just try again in a little while and see what happens.

It could also mean that you're fetching pages too quickly. The fix is to not do that: keep CONCURRENT_REQUESTS at 1 (and possibly add a download delay). Be polite.
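As a minimal sketch, a polite configuration in the project's settings.py might look like the following; the exact delay values are illustrative, not taken from the question:

```python
# settings.py -- illustrative throttling settings, tune to the target site
CONCURRENT_REQUESTS = 1           # only one request in flight at a time
DOWNLOAD_DELAY = 2                # wait ~2 seconds between requests
RANDOMIZE_DOWNLOAD_DELAY = True   # jitter the delay (0.5x to 1.5x of DOWNLOAD_DELAY)

# Alternatively, let Scrapy adapt the request rate to the server's responses
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2
AUTOTHROTTLE_MAX_DELAY = 30
```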

And you will encounter various errors like this on any crawl of meaningful size. Just make sure that your crawler can handle them.
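Scrapy's built-in RetryMiddleware already retries 503 responses a few times before giving up. A rough sketch of making that explicit and logging requests that still fail (the spider name and URL are placeholders, not from the question):

```python
import scrapy

class ForumSpider(scrapy.Spider):
    name = "forum"  # placeholder spider name
    start_urls = ["http://www.example.com/profile.php?id=1580"]

    custom_settings = {
        "RETRY_ENABLED": True,
        "RETRY_TIMES": 3,                               # retry each failed request up to 3 times
        "RETRY_HTTP_CODES": [500, 502, 503, 504, 408],  # 503 is retried by default
    }

    def start_requests(self):
        for url in self.start_urls:
            # errback is called when the request fails even after retries
            yield scrapy.Request(url, callback=self.parse, errback=self.on_error)

    def parse(self, response):
        self.logger.info("Got %s for %s", response.status, response.url)
        # ... extract profile data here ...

    def on_error(self, failure):
        # Log and move on instead of letting one bad URL stop the crawl
        self.logger.warning("Request failed after retries: %r", failure.request.url)
```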

answered Feb 20 '23 by root