I cannot find a solution to the following problem. I am using Scrapy (latest version) and am trying to debug a spider.
When I run scrapy shell https://jigsaw.w3.org/HTTP/300/301.html
the shell does not follow the redirect (it uses a default spider to fetch the data). When I run my own spider, it follows the 301, but then I cannot debug it.
How can I make the shell follow the 301 so that I can debug the final page?
Scrapy handles redirects with its RedirectMiddleware, but that middleware is not enabled in the shell. A quick fix is to fetch the redirect target yourself:
scrapy shell "https://jigsaw.w3.org/HTTP/300/301.html"
fetch(response.headers['Location'])
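One thing to watch for with the quick fix: Scrapy returns header values as bytes, and a server may send a relative Location. Below is a minimal sketch of a helper that turns the raw header into an absolute URL using only the standard library. The name resolve_location and the example paths in the tests are hypothetical, not part of Scrapy.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Return the absolute redirect target for a raw Location header value."""
    # Header values read from a Scrapy response are bytes, so decode first.
    if isinstance(location, bytes):
        location = location.decode("latin-1")
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against the URL that produced the redirect.
    return urljoin(request_url, location)
```

Inside the shell, assuming the helper has been pasted in, this would be used as fetch(resolve_location(response.url, response.headers['Location'])).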
Also, to debug your spider you probably want to inspect the response it is actually receiving:
    from scrapy.shell import inspect_response

    # inside your spider class
    def parse(self, response):
        # the spider will stop here and open up an interactive shell during the run
        inspect_response(response, self)