
Yielding an item in the errback, or when a Request returns 302

Tags:

python

scrapy

I have an issue using Scrapy:

yield Request(a_url[0],
                    meta={'item': aitem}, dont_filter=True,
                    callback=self.redeem_url, errback=self.error_page)


    def redeem_url(self, response):
       item = response.request.meta['item']
       item['Click_to_Redeem_URL'] = response.url
       yield item

aitem is populated before the a_url[0] request is made. Sometimes that request returns a 301, 302, or 404 status. What I want is for the item to be yielded even when I can't get a 200 response from a_url[0]. I haven't found a way to do this: on a 302, Scrapy retries the request and never reaches the error_page errback, and on a 404 it does reach error_page, but I don't know how to yield the item from there, because as far as I know the errback receives a failure object rather than a response, and that failure does not contain the item in meta.

Thanks in advance.

akhter wahab asked Nov 05 '22 06:11


1 Answer

You could try:

yield Request(a_url[0],
                meta={'item': aitem, 'dont_retry': True}, dont_filter=True,
                callback=self.redeem_url,
                # bind aitem as a default argument so each errback keeps its own item
                errback=lambda failure, item=aitem: self.error_page(failure, item))


def redeem_url(self, response):
   item = response.request.meta['item']
   item['Click_to_Redeem_URL'] = response.url
   yield item

Setting dont_retry in meta should stop Scrapy's retry middleware from retrying the request:

http://readthedocs.org/docs/scrapy/en/latest/topics/downloader-middleware.html#module-scrapy.contrib.downloadermiddleware.retry
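If the 302 is actually being followed by Scrapy's redirect middleware rather than retried, dont_retry alone may not be enough. A minimal sketch, assuming a recent Scrapy version where the dont_redirect and handle_httpstatus_list meta keys are honored per request: the non-200 response is then handed to the normal callback, which can check the status and still yield the item. build_meta is a hypothetical helper, the functions would be spider methods in real code, and the SimpleNamespace object is only a stand-in for a Scrapy response.

```python
from types import SimpleNamespace


def build_meta(item):
    """Request meta that turns off retries and redirects for this one request."""
    return {
        'item': item,
        'dont_retry': True,          # skip the retry middleware
        'dont_redirect': True,       # do not follow 301/302 automatically
        # let these statuses reach the callback instead of being filtered out
        'handle_httpstatus_list': [301, 302, 404],
    }


def redeem_url(response):
    """Callback sketch: fill in the URL only on a 200, but always yield the item."""
    item = response.meta['item']
    if response.status == 200:
        item['Click_to_Redeem_URL'] = response.url
    yield item


# Stand-in for a redirected response, for illustration only.
fake_response = SimpleNamespace(
    status=302,
    url='http://example.com/redirect',
    meta=build_meta({'title': 'example'}),
)
items = list(redeem_url(fake_response))
```

With this approach the errback is only needed for failures that produce no response at all, such as DNS errors or timeouts.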

The lambda lets aitem be passed to your error callback along with the failure object.
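As an alternative to the lambda, the item can usually be recovered inside the errback itself. A minimal sketch, assuming Scrapy attaches the failed request to the failure object as failure.request (as the errback examples in the Scrapy documentation do): the errback reads the item back from meta and yields it. error_page is written as a plain function here (a spider method in real code), and the SimpleNamespace object is only a stand-in for the Failure that Scrapy would pass.

```python
from types import SimpleNamespace


def error_page(failure, item=None):
    """Errback sketch: yield the already populated item when the request fails."""
    if item is None:
        # Assumption: the original request, and therefore its meta dict,
        # is reachable through failure.request.
        item = failure.request.meta['item']
    yield item


# Stand-in for the failure object, for illustration only.
fake_failure = SimpleNamespace(
    request=SimpleNamespace(meta={'item': {'title': 'example'}})
)
recovered = list(error_page(fake_failure))
```

Keeping the optional item parameter means the same errback works both with the lambda shown above and when passed directly as errback=self.error_page.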

Peter de Rivaz answered Nov 07 '22 22:11