I want to set a custom parameter on my request so I can retrieve it when I process the response in parse_item. This is my code:
from scrapy.http import Request

def start_requests(self):
    yield Request("site_url", meta={'test_meta_key': 'test_meta_value'})

def parse_item(self, response):
    print(response.meta)
parse_item will be called according to the following rules:
self.rules = (
    Rule(SgmlLinkExtractor(deny=tuple(self.deny_keywords), allow=tuple(self.client_keywords)), callback='parse_item'),
    Rule(SgmlLinkExtractor(deny=tuple(self.deny_keywords), allow=('',))),
)
According to the Scrapy docs:
the Response.meta attribute is propagated along redirects and retries, so you will get the original Request.meta sent from your spider.
But I don't see the custom meta in parse_item. Is there any way to fix this? Is meta the right way to go?
When you generate a new Request, you need to specify the callback function; otherwise the response will be passed to CrawlSpider's parse method by default.
I ran into a similar problem and it took me a while to debug.
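Here is a minimal sketch of that fix, assuming the spider subclasses CrawlSpider; the spider name and URL below are placeholders, not from the original question:

from scrapy.http import Request
from scrapy.contrib.spiders import CrawlSpider

class ExampleSpider(CrawlSpider):
    # hypothetical spider; only the callback wiring matters here
    name = 'example'

    def start_requests(self):
        # Point the request at parse_item explicitly. Without a callback,
        # CrawlSpider routes the response through its own parse() method,
        # and the requests generated by the rules do not carry your meta.
        yield Request(
            "http://www.example.com",
            meta={'test_meta_key': 'test_meta_value'},
            callback=self.parse_item,
        )

    def parse_item(self, response):
        # meta is propagated along redirects and retries, so the custom
        # key set in start_requests is available here
        print(response.meta.get('test_meta_key'))

With the callback set, response.meta in parse_item contains the custom key from the original request.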
callback (callable) – the function that will be called with the response of this request (once it's downloaded) as its first parameter. For more information see Passing additional data to callback functions below. If a Request doesn't specify a callback, the spider's parse() method will be used. Note that if exceptions are raised during processing, errback is called instead.
method (string) – the HTTP method of this request. Defaults to 'GET'.
meta (dict) – the initial values for the Request.meta attribute. If given, the dict passed in this parameter will be shallow copied.
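The shallow copy in that last point is worth noting: the dict you pass is copied, but mutable values inside it are shared. A small illustration (the keys here are hypothetical):

from scrapy.http import Request

state = {'tries': 0}
req = Request("http://www.example.com", meta={'state': state})
state['tries'] += 1
# The copy is shallow: only the top-level dict is copied, so
# req.meta['state'] is the same object as state.
print(req.meta['state']['tries'])  # -> 1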