
Python Scrapy: how to define a pipeline for an item?

I am using Scrapy to crawl different sites; for each site I have an Item (different information is extracted).

For example, I have a generic pipeline (most of the information is the same), but now I am crawling Google search results and the pipeline must be different.

For example:

GenericItem uses GenericPipeline

GoogleItem, however, should use GoogleItemPipeline, yet when the spider is crawling it tries to use GenericPipeline instead of GoogleItemPipeline. How can I specify which pipeline the Google spider must use?

llazzaro asked Jun 29 '09 05:06


People also ask

What is Item pipeline in Scrapy?

Each item pipeline component (sometimes referred to as just “Item Pipeline”) is a Python class that implements a simple method. It receives an item and performs an action on it, also deciding whether the item should continue through the pipeline or be dropped and no longer processed.

How do you activate the pipeline in Scrapy?

You can activate an Item Pipeline component by adding its class to the ITEM_PIPELINES setting. The integer value assigned to each class determines the order in which the components run (from lower values to higher values); by convention the values are kept in the 0-1000 range.
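As a minimal sketch of that setting, assuming a project module named `myproject.pipelines` (a hypothetical path) that holds the two pipelines from the question, a `settings.py` entry might look like this. The numbers 300 and 800 are arbitrary illustrative values.

```python
# settings.py (sketch; "myproject.pipelines" is a hypothetical module path)
ITEM_PIPELINES = {
    "myproject.pipelines.GenericPipeline": 300,
    "myproject.pipelines.GoogleItemPipeline": 800,
}

# Components run from the lowest value to the highest, so sorting the
# keys by value reproduces the execution order.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)
```

Here `run_order` is only a helper for illustration; Scrapy itself reads `ITEM_PIPELINES` and applies the ordering.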

How does a Scrapy pipeline work?

Scrapy is a web scraping library used to scrape, parse, and collect web data. A project's pipelines.py file handles the scraped data through various components (classes), which are executed sequentially.


1 Answer

Right now there is only one way: check the item type inside the pipeline and either process the item or return it as is.

pipelines.py:

from grabbers.items import FeedItem

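class StoreFeedPost(object):

    # Signature as used when this answer was written; current Scrapy
    # releases call process_item(self, item, spider) instead.
    def process_item(self, domain, item):
        if isinstance(item, FeedItem):
            # process FeedItem instances here...
            pass

        # any other item type is passed through unchanged
        return item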

items.py:

from scrapy.item import ScrapedItem

class FeedItem(ScrapedItem):
    pass
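The same type check can be sketched for the two pipelines named in the question. This is an illustration under stated assumptions, not the asker's actual code: plain dataclasses stand in for Scrapy items so the example runs without Scrapy installed, the fields `url`, `title`, and `rank` are hypothetical, the method uses the `process_item(self, item, spider)` signature of current Scrapy releases, and `run_pipelines` is a hypothetical stand-in for the way Scrapy passes each item through every enabled pipeline in order.

```python
from dataclasses import dataclass


@dataclass
class GenericItem:
    url: str          # hypothetical field
    title: str = ""   # hypothetical field


@dataclass
class GoogleItem:
    url: str          # hypothetical field
    rank: int = 0     # hypothetical field


class GenericPipeline:
    def process_item(self, item, spider):
        # Leave Google results alone; they belong to GoogleItemPipeline.
        if isinstance(item, GoogleItem):
            return item
        item.title = item.title.strip()
        return item


class GoogleItemPipeline:
    def process_item(self, item, spider):
        # Anything that is not a GoogleItem is returned "as is".
        if not isinstance(item, GoogleItem):
            return item
        item.rank = max(item.rank, 1)
        return item


def run_pipelines(item, pipelines, spider=None):
    # Stand-in for Scrapy: each pipeline receives the previous one's result.
    for pipeline in pipelines:
        item = pipeline.process_item(item, spider)
    return item
```

Both pipelines stay enabled for every spider; each one simply ignores the item types it does not own. Later Scrapy releases also let a spider override settings through a `custom_settings` class attribute, which can carry its own `ITEM_PIPELINES` mapping if per-spider selection is preferred.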
slav0nic answered Oct 26 '22 20:10