Relevant Code
def start_requests(self):
    requests = [
        Request(url['url'], meta=url['meta'], callback=self.parse, errback=self.handle_error)
        for url in self.start_urls
        if valid_url(url['url'])
    ]
    return requests

def handle_error(self, err):
    # Errors being saved in DB
    # So I don't want them displayed in the logs
I've got my own code for saving error codes in DB. I don't want them displayed in the log output. How can I suppress these errors?
Note that I don't want to suppress all errors - just the ones being handled here.
Try using self.skipped.add and self.failed.add behind an isinstance condition in your handle_error method.
Here is an example:
from scrapy.spidermiddlewares.httperror import HttpError

def on_error(self, failure):
    if isinstance(failure.value, HttpError):
        response = failure.value.response
        if response.status in self.bypass_status_codes:
            # Handled case: record it and continue with the normal callback
            self.skipped.add(response.url[-3:])
            return self.parse(response)

    # it assumes there is a response attached to failure
    self.failed.add(failure.value.response.url[-3:])
    return failure
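The example above relies on three attributes it does not define: self.skipped, self.failed and self.bypass_status_codes. Below is a minimal sketch of that state, written as a standalone class so it runs without Scrapy installed. The class name ErrorBookkeeping, the method classify, the status codes and the fake failure objects are all assumptions for illustration; they are not part of the answer above or of Scrapy's API, and the getattr check is only a duck-typed stand-in for the isinstance(failure.value, HttpError) test.

```python
from types import SimpleNamespace


class ErrorBookkeeping:
    """Hypothetical holder for the state that on_error expects on the spider."""

    # Assumed values: the statuses you want to treat as "handled".
    bypass_status_codes = {301, 302}

    def __init__(self):
        self.skipped = set()
        self.failed = set()

    def classify(self, failure):
        """Record the failure; return True if it was a handled (skipped) case."""
        # Only HTTP errors carry a response, so a missing attribute means
        # some other failure type (timeout, DNS error, ...).
        response = getattr(failure.value, "response", None)
        if response is not None and response.status in self.bypass_status_codes:
            self.skipped.add(response.url)
            return True
        self.failed.add(response.url if response is not None else repr(failure.value))
        return False


# Usage with fake failure objects (stand-ins for Twisted failures).
tracker = ErrorBookkeeping()
redirect = SimpleNamespace(
    value=SimpleNamespace(
        response=SimpleNamespace(status=302, url="http://example.com/a")
    )
)
timeout = SimpleNamespace(value=TimeoutError("timed out"))
handled = [tracker.classify(redirect), tracker.classify(timeout)]
```

In a real spider these sets would typically live on the spider instance, and the failure without a response is the case the original comment warns about.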
The answer by @Daniil Mashkin seems to be the most comprehensive solution.
For simple cases, you can add the HTTP error codes to the spider's handle_httpstatus_list attribute or to HTTPERROR_ALLOWED_CODES in settings.py.
Responses with those status codes are then sent to your callback function instead of being treated as errors, so the logging is skipped as well.
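As a rough sketch of the simple case: once a status code is allowed, the callback receives the error response itself and has to tell it apart from a normal page. The status codes below, the helper name split_by_status and the fake response objects are assumptions for illustration only; the per-spider equivalent of the setting would be a class attribute such as handle_httpstatus_list = [404, 503].

```python
from types import SimpleNamespace

# settings.py fragment: let these statuses through to spider callbacks.
# The specific codes are an assumption for illustration.
HTTPERROR_ALLOWED_CODES = [404, 503]


def split_by_status(response, error_rows):
    """Hypothetical helper to call at the top of a callback such as parse().

    Appends (url, status) to error_rows for 4xx/5xx responses and returns
    True when the response should not be parsed as a normal page.
    """
    if response.status >= 400:
        error_rows.append((response.url, response.status))
        return True
    return False


# Usage with fake responses; in a spider these would be scrapy Response objects
# and error_rows would be replaced by your own DB-saving code.
rows = []
missing = SimpleNamespace(url="http://example.com/gone", status=404)
ok = SimpleNamespace(url="http://example.com/", status=200)
results = [split_by_status(missing, rows), split_by_status(ok, rows)]
```

This only covers HTTP status errors; failures with no response at all (timeouts, DNS errors) still arrive through the errback.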