Passing an argument to a callback function

def parse(self, response):
    for sel in response.xpath('//tbody/tr'):
        item = HeroItem()
        item['hclass'] = response.request.url.split("/")[8].split('-')[-1]
        item['server'] = response.request.url.split('/')[2].split('.')[0]
        item['hardcore'] = len(response.request.url.split("/")[8].split('-')) == 3
        item['seasonal'] = response.request.url.split("/")[6] == 'season'
        item['rank'] = sel.xpath('td[@class="cell-Rank"]/text()').extract()[0].strip()
        item['battle_tag'] = sel.xpath('td[@class="cell-BattleTag"]//a/text()').extract()[1].strip()
        item['grift'] = sel.xpath('td[@class="cell-RiftLevel"]/text()').extract()[0].strip()
        item['time'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
        item['date'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
        url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()

        yield Request(url, callback=self.parse_profile)

def parse_profile(self, response):
    sel = Selector(response)
    item = HeroItem()
    item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    return item

I'm scraping a whole table in the main parse method and taking several fields from it. One of these fields is a URL, and I want to follow it to get a whole new set of fields. How can I pass my already created item object to the callback function so that the final item keeps all the fields?

As the code above shows, I can save either the fields from the URL (the current code) or only the ones from the table (by simply writing yield item), but I can't yield a single object with all the fields together.

I have tried this, but obviously it doesn't work:

yield Request(url, callback=self.parse_profile(item))

def parse_profile(self, response, item):
    sel = Selector(response)
    item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    return item

asked Aug 27 '15 14:08

vic

2 Answers

This is what the meta keyword argument of Request is for. Whatever you put in the meta dict is available in the callback as response.meta.

def parse(self, response):
    for sel in response.xpath('//tbody/tr'):
        item = HeroItem()
        # Item assignment here
        url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()

        yield Request(url, callback=self.parse_profile, meta={'hero_item': item})

def parse_profile(self, response):
    item = response.meta.get('hero_item')
    item['weapon'] = response.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    yield item

Also note that sel = Selector(response) is unnecessary and inconsistent with your parse method, which already calls response.xpath directly, so I changed it. The response already carries a selector as response.selector, and response.xpath is a convenience shortcut for response.selector.xpath.
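
As for why the attempt in the question fails: callback=self.parse_profile(item) calls the method immediately, rather than handing Scrapy a function to call later. The following is a minimal plain-Python sketch of that difference. FakeRequest, FakeResponse, run and broken_callback are hypothetical stand-ins for illustration, not Scrapy APIs, and the 'sword' value is a placeholder for the real XPath extraction.

```python
# Plain-Python illustration. FakeRequest, FakeResponse and run() are
# hypothetical stand-ins, not part of Scrapy.

class FakeResponse:
    def __init__(self, request):
        self.request = request
        self.meta = request.meta  # mimic the request's meta being visible on the response


class FakeRequest:
    def __init__(self, url, callback, meta=None):
        self.url = url
        self.callback = callback  # a callable, stored now and invoked later
        self.meta = meta or {}


def run(request):
    """Pretend to download the page, then invoke the stored callback."""
    return request.callback(FakeResponse(request))


def parse_profile(response):
    item = response.meta['hero_item']  # pick up the item built earlier
    item['weapon'] = 'sword'           # placeholder for the real extraction
    return item


def broken_callback(response, item):
    item['weapon'] = 'sword'
    return item


item = {'rank': '1'}

# Working pattern: pass the function itself and carry the item in meta.
good = FakeRequest('https://example.com/profile',
                   callback=parse_profile,
                   meta={'hero_item': item})
result = run(good)

# Broken pattern: the call happens right here, before any request exists.
# The item is bound to the `response` parameter and `item` is missing.
try:
    FakeRequest('https://example.com/profile', callback=broken_callback(item))
    call_error = None
except TypeError as exc:
    call_error = exc
```

In the broken pattern Python raises a TypeError before the request object is even constructed, which is why the item has to travel with the request instead.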

answered Oct 16 '22 21:10

Rejected

A better way to pass arguments to a callback function is cb_kwargs, available since Scrapy 1.7:

def parse(self, response):
    request = scrapy.Request('http://www.example.com/index.html',
                             callback=self.parse_page2,
                             cb_kwargs=dict(main_url=response.url))
    request.cb_kwargs['foo'] = 'bar'  # add more arguments for the callback
    yield request

def parse_page2(self, response, main_url, foo):
    yield dict(
        main_url=main_url,
        other_url=response.url,
        foo=foo,
    )

source: https://docs.scrapy.org/en/latest/topics/request-response.html#topics-request-response-ref-request-callback-arguments

answered Oct 16 '22 22:10

penduDev

Tags: python, arguments, callback, scrapy
