 

Scrapy Spider - For loop within response callback not iterating

Tags: python, scrapy

I am trying to use the link parsing structure described by "warwaruk" in this SO thread: Following links, Scrapy web crawler framework

This works great when grabbing only a single item from each page. However, when I try to use a for loop to scrape all items on each page, the parse_item function appears to terminate upon reaching the first yield statement. I have a custom pipeline set up to handle each item, but it currently receives only one item per page.

Let me know if I need to include more code, or clarification. THANKS!

def parse_item(self,response):  
    hxs = HtmlXPathSelector(response)
    prices = hxs.select("//div[contains(@class, 'item')]/script/text()").extract()
    for prices in prices:
        item = WalmartSampleItem()
        ...
        yield items
Tyler asked Mar 29 '26 19:03


1 Answer

You should yield the single item you just built, not items (which is never defined). While you're at it, rename the loop variable so it doesn't shadow the prices list:

for price in prices:
    item = WalmartSampleItem()
    ...
    yield item
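As a minimal sketch of why this works (plain Python, no Scrapy required, with a dict standing in for WalmartSampleItem): a callback that yields inside a loop is a generator, and Scrapy iterates over everything it produces, so one item is emitted per loop pass rather than the function stopping at the first yield.

```python
def parse_item(prices):
    # Simplified stand-in for the Scrapy callback: yield one item per price.
    for price in prices:
        item = {"price": price}  # stand-in for WalmartSampleItem()
        yield item

# The generator produces one item per iteration, not just the first.
items = list(parse_item(["$10", "$20", "$30"]))
print(len(items))  # 3
```

If only one item per page reaches your pipeline, the culprit is usually an exception (such as the NameError from yield items) cutting the generator short after its first value.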
alecxe answered Mar 31 '26 10:03


