
Scrapy item loader returns a list, not a single value

I am using Scrapy 0.20.

I want to use an item loader.

This is my code:

l = XPathItemLoader(item=MyItemClass(), response=response)
l.add_value('url', response.url)
l.add_xpath('title', "my xpath")
l.add_xpath('developer', "my xpath")
return l.load_item()

I get the result in the JSON file, but the url is a list, the title is a list, and the developer is a list.

How can I extract a single value instead of a list?

Should I write an item pipeline for that? I hope there is a faster way.

Marco Dinatsoli asked May 27 '14 16:05

1 Answer

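The loader collects every value it finds for a field into a list, and by default it returns that list unchanged. To get a single value you need to set an input or output processor. TakeFirst, which returns the first value that is neither None nor an empty string, would work well in your case.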
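To make the effect of TakeFirst concrete, here is a minimal pure-Python sketch. The function name take_first is hypothetical and is not part of Scrapy; it only mimics the documented behaviour of returning the first value that is neither None nor an empty string.

```python
def take_first(values):
    """Hypothetical stand-in for Scrapy's TakeFirst processor."""
    for value in values:
        # Skip None and empty strings, keep anything else (including 0).
        if value is not None and value != "":
            return value
    return None


# An item loader gathers every match for a field into a list like this one.
collected_title = ["", "My App", "My App (second match)"]

print(take_first(collected_title))
```

With the sample list above, the empty string is skipped and the first real match is returned as a plain string rather than a list.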

There are multiple places where you can define it, e.g. in the Item definition:

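from scrapy.item import Item, Field
# Import path for Scrapy 1.x. In Scrapy 0.20 the module is
# scrapy.contrib.loader.processor; in recent 2.x releases it is
# itemloaders.processors.
from scrapy.loader.processors import TakeFirst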

class MyItem(Item):
    url = Field(output_processor=TakeFirst())
    title = Field(output_processor=TakeFirst())
    developer = Field(output_processor=TakeFirst())

Or set default_output_processor on the XPathItemLoader instance:

l.default_output_processor = TakeFirst()
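Why a default output processor changes every field at once can be sketched with a toy model. MiniLoader and first below are hypothetical names invented for illustration; this is not Scrapy's code, only a simplified picture of "collect values into lists, then run one processor per field when the item is loaded".

```python
class MiniLoader:
    """Hypothetical, simplified model of an item loader (not Scrapy code)."""

    def __init__(self):
        self._values = {}
        # Identity-like default: hand back the collected list unchanged.
        self.default_output_processor = lambda values: values

    def add_value(self, field, value):
        # Every added value is appended to that field's list.
        self._values.setdefault(field, []).append(value)

    def load_item(self):
        # The output processor runs once per field at load time.
        return {
            field: self.default_output_processor(values)
            for field, values in self._values.items()
        }


def first(values):
    """Hypothetical stand-in for TakeFirst."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


plain = MiniLoader()
plain.add_value("url", "http://example.com")
plain_item = plain.load_item()  # the url is still wrapped in a list

single = MiniLoader()
single.add_value("url", "http://example.com")
single.default_output_processor = first
single_item = single.load_item()  # the url is now a bare string
```

In this sketch the only difference between the two loaders is the processor assigned before load_item() is called, which mirrors the one-line change suggested above.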
alecxe answered Nov 01 '22 16:11