I've been working on a Scrapy project for a while now, and I wanted to integrate Sentry.
I tried scrapy-sentry, but it didn't work at all.
I also tried implementing it with an extension, but that only works if the error occurs in a spider's callback (not in pipelines.py, items.py)...
from scrapy import signals
from raven import Client


class FailLogger(object):

    def __init__(self, dsn):
        # build the Raven client from the DSN passed in by from_crawler()
        self.client = Client(dsn)

    @classmethod
    def from_crawler(cls, crawler):
        ext = cls(crawler.settings.get('SENTRY_DSN'))
        crawler.signals.connect(ext.spider_error, signal=signals.spider_error)
        return ext

    def spider_error(self, failure, response, spider):
        try:
            # re-raise the original exception so Raven can capture it
            failure.raiseException()
        except Exception:
            self.client.get_ident(self.client.captureException())
Is there any way I can log errors (in spiders, items, pipelines, ...) to Sentry, like in Django?
Thank you.
It's an old post, but my answer may be useful to others. Raven was replaced by sentry-python (named sentry-sdk in pip). With this new package, there is a much simpler and more complete solution than scrapy-sentry. It relies on the fact that Scrapy's logging features are built on the stdlib logging module.
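To illustrate why building on the stdlib logging module matters, here is a minimal sketch that uses only the standard library, not sentry-sdk. ErrorCollector is a hypothetical stand-in for an error reporter, and the logger names are made up. It shows that a single hook on the root logger sees errors logged by any component, which is, as far as I know, the same mechanism that lets sentry-sdk's logging support pick up errors from pipelines and middlewares as well as spiders.

```python
import logging


class ErrorCollector(logging.Handler):
    """Hypothetical stand-in for an error reporter: keeps records at ERROR or above."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.records = []

    def emit(self, record):
        self.records.append(record)


collector = ErrorCollector()
logging.getLogger().addHandler(collector)  # attach once, on the root logger

# Hypothetical logger names, standing in for different Scrapy components.
pipeline_logger = logging.getLogger("myproject.pipelines")
middleware_logger = logging.getLogger("myproject.middlewares")

try:
    1 / 0
except ZeroDivisionError:
    # logger.exception() logs at ERROR level and attaches the traceback
    pipeline_logger.exception("Error processing item")

middleware_logger.error("Download failed")
middleware_logger.info("Not an error, so it is not collected")

for record in collector.records:
    print(record.name, record.levelname, record.getMessage())
```

Because records propagate from child loggers up to the root logger, nothing has to be registered per spider, pipeline, or middleware.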
You can use the following very simple Scrapy extension to catch exceptions and errors inside and outside spiders (including downloader middlewares, item middlewares, etc.).
Add the SentryLogging extension to the extensions.py file of your Scrapy project:

import sentry_sdk
from scrapy.exceptions import NotConfigured


class SentryLogging(object):
    """
    Send exceptions and errors to Sentry.
    """

    @classmethod
    def from_crawler(cls, crawler):
        sentry_dsn = crawler.settings.get('SENTRY_DSN', None)
        if sentry_dsn is None:
            raise NotConfigured
        # instantiate the extension object
        ext = cls()
        # initialize the Sentry SDK with the project's DSN
        sentry_sdk.init(sentry_dsn)
        # return the extension object
        return ext

Then register it in settings.py with a low order value, so that it is loaded before the other extensions and catches exceptions and errors as soon as possible:

# Enable or disable extensions
# See https://doc.scrapy.org/en/latest/topics/extensions.html
EXTENSIONS = {
    'myproject.extensions.SentryLogging': -1,  # Load SentryLogging extension before others
}

# Send exceptions to Sentry
# replace SENTRY_DSN by your own DSN
SENTRY_DSN = "XXXXXXXXXX"

Make sure to replace SENTRY_DSN by the Sentry DSN of the associated project.
Errors and exceptions inside and outside spiders should now be sent to Sentry. If you want to further customize what is sent to Sentry, you may want to edit the call to sentry_sdk.init() according to its documentation.
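As one possible way to customize that call, the sketch below builds the keyword arguments for sentry_sdk.init() from the project settings. The helper build_sentry_options and the setting names SENTRY_ENVIRONMENT and SENTRY_RELEASE are assumptions made for illustration. The dsn, environment and release options are, to my knowledge, accepted by sentry_sdk.init(), but check its documentation for the version you use.

```python
def build_sentry_options(settings):
    """Collect keyword arguments for sentry_sdk.init() from a settings mapping."""
    options = {"dsn": settings.get("SENTRY_DSN")}
    # SENTRY_ENVIRONMENT and SENTRY_RELEASE are hypothetical setting names.
    for setting_name, option_name in (
        ("SENTRY_ENVIRONMENT", "environment"),
        ("SENTRY_RELEASE", "release"),
    ):
        value = settings.get(setting_name)
        if value is not None:
            options[option_name] = value
    return options


# A plain dict stands in for crawler.settings here; both expose .get().
example_settings = {"SENTRY_DSN": "XXXXXXXXXX", "SENTRY_ENVIRONMENT": "staging"}
options = build_sentry_options(example_settings)
print(options)

# In from_crawler(), this could replace sentry_sdk.init(sentry_dsn):
#     sentry_sdk.init(**build_sentry_options(crawler.settings))
```

Keeping the option building in a small function separate from the extension makes it easy to check without a Sentry project or a running crawl.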