Relevant Code
def start_requests(self):
    requests = [
        Request(url['url'], meta=url['meta'], callback=self.parse, errback=self.handle_error)
        for url in self.start_urls
        if valid_url(url['url'])
    ]
    return requests

def handle_error(self, err):
    # Errors being saved in DB
    # So I don't want them displayed in the logs
I've got my own code for saving error codes in the DB, so I don't want them displayed in the log output. How can I suppress these errors?
Note that I don't want to suppress all errors - just the ones being handled here.
Try using self.skipped.add and self.failed.add together with an isinstance check in your handle_error method.
Here is an example:
from scrapy.spidermiddlewares.httperror import HttpError

def on_error(self, failure):
    if isinstance(failure.value, HttpError):
        response = failure.value.response
        if response.status in self.bypass_status_codes:
            # Status code we chose to tolerate: record it and parse as usual
            self.skipped.add(response.url[-3:])
            return self.parse(response)

    # it assumes there is a response attached to failure
    self.failed.add(failure.value.response.url[-3:])
    return failure
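For the original goal (save the error, keep it out of the log), here is a rough sketch of an errback that records the failure and returns nothing. As far as I understand Scrapy's behaviour, a failure returned from the errback keeps propagating and gets logged, while an errback that returns None is treated as having handled it. The names describe_failure, ErrorRecordingMixin, save_error and saved_errors are made up for illustration, and the in-memory list is only a stand-in for the real database write. Unlike the example above, it does not assume a response is attached to the failure.

```python
def describe_failure(failure):
    """Return (url, status) for a failure passed to an errback.

    HttpError failures carry a response on failure.value; other failures
    (DNS errors, timeouts) usually do not, so fall back to failure.request.
    """
    response = getattr(failure.value, "response", None)
    if response is not None:
        return response.url, response.status
    request = getattr(failure, "request", None)
    return getattr(request, "url", None), None


class ErrorRecordingMixin:
    """Hypothetical mixin: combine with a scrapy.Spider subclass and pass
    errback=self.handle_error when building each Request."""

    def save_error(self, url, status, reason):
        # Stand-in for the real database write (assumption for this sketch).
        if not hasattr(self, "saved_errors"):
            self.saved_errors = []
        self.saved_errors.append({"url": url, "status": status, "reason": reason})

    def handle_error(self, failure):
        url, status = describe_failure(failure)
        self.save_error(url, status, repr(failure.value))
        # No return value on purpose: the failure is not passed further down
        # the chain, so it should not be reported again.
        return None
```

Errors you do not want to swallow can still be returned from handle_error (return failure) so they show up in the log as before.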
Answer by @Daniil Mashkin seems to be the most comprehensive solution.
For simple cases, you can add the HTTP error codes to the spider's handle_httpstatus_list attribute or to HTTPERROR_ALLOWED_CODES in settings.py.
Responses with those status codes are then passed to your callback function instead of being filtered out, so they are not logged as errors either.
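A minimal illustration of the two options, using 404 and 500 purely as example codes (an assumption, pick the ones you actually handle). MySpider is a placeholder name.

```python
# settings.py: applies to every spider in the project.
HTTPERROR_ALLOWED_CODES = [404, 500]

# Per-spider alternative: set a class attribute on the spider instead.
# class MySpider(scrapy.Spider):
#     handle_httpstatus_list = [404, 500]
```

With either setting in place, the callback has to check response.status itself before parsing. This only covers HTTP status errors; failures without a response, such as DNS errors or timeouts, still go to the errback.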