 

Python Scrapy - Execute code after spider exits

Tags:

python

scrapy

I cannot find an answer to this question: how can I execute Python code after a Scrapy spider exits?

I tried calling self.my_function() inside the callback that parses the response (def parse_item(self, response):), and then defined my_function(), but the problem is that it still runs inside the spider's loop, once per response. What I want is to run a function once, outside the spider's loop, with all the gathered data. Thanks.

asked Dec 30 '25 21:12 by Pavel Nikolaev
1 Answer

Use the closed method of the scrapy.Spider class, as follows:

import scrapy

class MySpider(scrapy.Spider):
    name = "my_spider"
    # data collected across parse() calls
    spider_attr = []

    def parse(self, response):
        # do your scraping logic here, e.g.:
        # page_text = response.xpath('//text()').extract()
        self.spider_attr.append(response.url)  # append whatever data you collect

    def closed(self, reason):
        # called once, when the crawl ends
        # do something with the collected data
        for i in self.spider_attr:
            print(i)
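To see why closed fires only after the crawl loop finishes, here is a minimal stand-in that mimics the Scrapy lifecycle without Scrapy itself (FakeEngine and the string "responses" are invented for illustration; the real engine drives parse with Response objects):

```python
class FakeEngine:
    """Hypothetical driver mimicking how Scrapy runs a spider:
    parse() is called once per response, closed() once at the very end."""
    def crawl(self, spider, responses):
        for response in responses:
            spider.parse(response)
        spider.closed("finished")  # Scrapy passes a reason string here

class MySpider:
    def __init__(self):
        self.spider_attr = []  # collected data, as in the answer
        self.summary = None

    def parse(self, response):
        # still "inside the loop": runs once per response
        self.spider_attr.append(response.upper())

    def closed(self, reason):
        # runs exactly once, after every parse() call has finished
        self.summary = f"{len(self.spider_attr)} items, reason={reason}"

spider = MySpider()
FakeEngine().crawl(spider, ["a", "b", "c"])
print(spider.summary)  # → 3 items, reason=finished
```

The same ordering holds in a real crawl: any aggregation you put in closed sees the complete spider_attr list.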
answered Jan 02 '26 09:01 by Evhz