
How can I get the spider name in an images pipeline?

Tags: python, scrapy

I have several images pipelines, but I want to use different saving methods for different spiders.

I know that in other pipelines I can use spider.name, but how can I get it in the images pipeline?

class MyImagesPipeline(ImagesPipeline):
    if spider.name in ['first']:
        def get_media_requests(self, item, info):
user19140477031 asked Jan 03 '13 06:01


2 Answers

The spider is passed as an argument to process_item:

https://scrapy.readthedocs.org/en/latest/topics/item-pipeline.html#item-pipeline-example

You could either store the spider on the pipeline instance when process_item runs, so that the other methods of the class can use it, or implement a hook yourself if you need the spider before process_item is called.

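# Older Scrapy releases: from scrapy.contrib.pipeline.images import ImagesPipeline
from scrapy.pipelines.images import ImagesPipeline


class MyImagesPipeline(ImagesPipeline):
    spider = None

    def process_item(self, item, spider):
        # Remember the spider so that other methods can branch on its name
        self.spider = spider
        # The base class schedules the downloads and calls
        # get_media_requests(item, info) itself
        return super(MyImagesPipeline, self).process_item(item, spider)

    def get_media_requests(self, item, info):
        if self.spider.name in ['first']:
            pass  # whatever
        return super(MyImagesPipeline, self).get_media_requests(item, info)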

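You could also get the spider from the info argument directly: the base class MediaPipeline defines an inner class SpiderInfo with a spider attribute, and an instance of it is passed as info to methods such as get_media_requests.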

see: https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/pipeline/media.py

Hedde van der Heide answered Oct 24 '22 01:10


info.spider is what you want.

def get_media_requests(self, item, info):
    info.spider.name
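
As a minimal sketch of how info.spider.name could drive per-spider behaviour, the mixin below trims the download list for selected spiders. The names PerSpiderRequestsMixin, single_image_spiders, FakeImagesPipeline, DemoPipeline and demo are hypothetical and not part of Scrapy; the stand-in base class only imitates ImagesPipeline so that the example runs without Scrapy installed.

```python
from types import SimpleNamespace


class PerSpiderRequestsMixin:
    """Hypothetical mixin: list it before ImagesPipeline in the bases."""

    # Hypothetical attribute: spiders that should download one image per item
    single_image_spiders = ['first']

    def get_media_requests(self, item, info):
        # Let the next class in the MRO (normally ImagesPipeline) build the requests
        requests = list(super().get_media_requests(item, info))
        # info.spider is the spider that produced this item
        if info.spider.name in self.single_image_spiders:
            return requests[:1]
        return requests


# With Scrapy installed the pipeline would be declared roughly as:
#     class MyImagesPipeline(PerSpiderRequestsMixin, ImagesPipeline): ...
# The stand-ins below only imitate the pieces the sketch needs.
class FakeImagesPipeline:
    def get_media_requests(self, item, info):
        # The real pipeline returns Request objects, not plain strings
        return list(item['image_urls'])


class DemoPipeline(PerSpiderRequestsMixin, FakeImagesPipeline):
    pass


def demo(spider_name):
    # Fake just enough of the info object: it only needs a spider with a name
    info = SimpleNamespace(spider=SimpleNamespace(name=spider_name))
    item = {'image_urls': ['a.jpg', 'b.jpg']}
    return DemoPipeline().get_media_requests(item, info)
```

Calling demo('first') keeps only the first URL, while any other spider name leaves the list untouched. The same check on info.spider.name works in any other pipeline method that receives info.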
Umair Ayub answered Oct 24 '22 00:10