Is there a way to trigger a method in a Spider class just before it terminates?
I can terminate the spider myself, like this:
    from scrapy.exceptions import CloseSpider
    from scrapy.spiders import CrawlSpider

    class MySpider(CrawlSpider):
        # Config stuff goes here...

        def quit(self):
            # Do some stuff...
            raise CloseSpider('MySpider is quitting now.')

        def my_parser(self, response):
            if termination_condition:
                self.quit()
            # Parsing stuff goes here...
But I can't find any information on how to determine when the spider is about to quit naturally.
It looks like you can register a signal listener through dispatcher.
I would try something like:
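    from scrapy import signals
    from scrapy.spiders import CrawlSpider
    from scrapy.xlib.pydispatch import dispatcher

    class MySpider(CrawlSpider):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # Call self.spider_closed whenever the spider_closed signal fires.
            dispatcher.connect(self.spider_closed, signals.spider_closed)

        def spider_closed(self, spider):
            # The second param is the instance of the spider about to be closed.
            pass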
In newer versions of Scrapy, scrapy.xlib.pydispatch is deprecated. Instead, you can use from pydispatch import dispatcher.
Just to update: you can simply define a closed method on the spider, and Scrapy will call it when the spider closes, like this:
    class MySpider(CrawlSpider):
        def closed(self, reason):
            # reason is a string describing why the spider was closed.
            do_something()