
How to access scrapy stats from a pipeline

Tags:

python

scrapy

From the Scrapy API docs I know that a crawler has a stats attribute, but how can I access it from a custom pipeline?

class MyPipeline(object):

    def __init__(self): 
        self.stats = ???

gusridd asked Dec 05 '22 20:12


2 Answers

An item pipeline can access the stats the same way an extension does. Like an extension, it receives the Crawler object through the from_crawler(cls, crawler) class method, and the stats collector is available as crawler.stats.

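In short, you can do something like:

class MyPipeline(object):

    def __init__(self, stats):
        # Keep a reference to the crawler's stats collector
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this to build the pipeline and passes in the crawler
        return cls(crawler.stats)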

http://scrapy.readthedocs.org/en/latest/topics/stats.html#topics-stats
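
To show how the stored stats object might then be used, here is a minimal sketch of a complete pipeline. The class name StatsPipeline and the stat key custom/items_seen are made up for illustration; inc_value is the stats collector method for incrementing a counter.

```python
class StatsPipeline:
    """Hypothetical pipeline that counts every item it sees."""

    def __init__(self, stats):
        # Stats collector handed over by from_crawler
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy passes the running crawler; its stats attribute is the collector
        return cls(crawler.stats)

    def process_item(self, item, spider):
        # "custom/items_seen" is an illustrative key, not a built-in Scrapy stat
        self.stats.inc_value("custom/items_seen")
        return item
```

The pipeline still has to be enabled in ITEM_PIPELINES in the project settings, as with any other pipeline. The custom counter should then show up in the stats dump that Scrapy logs when the spider closes.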


jbahamon answered Dec 14 '22 20:12


The stats are also available through spider.crawler, for example (v1.1.0):

class ObjPipeline(object):
    def process_item(self, item, spider):
        spider.crawler.stats.inc_value('scraped_items')
        ...
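
A slightly fuller sketch of the same approach, in case it helps: the pipeline below reads and writes stats only through spider.crawler.stats, so it needs no __init__ or from_crawler. The class name, the stat keys and the price field are all hypothetical, and items are assumed to be dict-like.

```python
class PriceStatsPipeline:
    """Hypothetical pipeline that tracks items with and without a price."""

    def process_item(self, item, spider):
        stats = spider.crawler.stats
        # Assumes dict-like items; "price" is an illustrative field name
        if item.get("price") is None:
            stats.inc_value("custom/items_missing_price")
        else:
            stats.inc_value("custom/items_with_price")
        return item

    def close_spider(self, spider):
        stats = spider.crawler.stats
        # get_value returns the default when the key was never set
        missing = stats.get_value("custom/items_missing_price", 0)
        stats.set_value("custom/has_missing_prices", missing > 0)
```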

slavugan answered Dec 14 '22 20:12