I'm new to Scrapy and would like to understand how to scrape an object for output into nested JSON. Right now, I'm producing JSON that looks like:
[
{'a' : 1,
'b' : '2',
'c' : 3},
]
And I'd like it more like this:
[
{ 'a' : '1',
'_junk' : {
'b' : 2,
'c' : 3}},
]
where I put some stuff in _junk subfields to post-process later.
The current code in the parser definition in my scrapername.py is:
item['a'] = x
item['b'] = y
item['c'] = z
And it seemed like
item['a'] = x
item['_junk']['b'] = y
item['_junk']['c'] = z
might fix that, but I'm getting an error about the _junk key:
File "/usr/local/lib/python2.7/dist-packages/scrapy/item.py", line 49, in __getitem__
return self._values[key]
exceptions.KeyError: '_junk'
Does this mean I need to change my items.py somehow? Currently I have:
from scrapy.item import Item, Field

class Website(Item):
    a = Field()
    _junk = Field()
    b = Field()
    c = Field()
You need to create the _junk dictionary before storing values in it. Reading item['_junk'] raises the KeyError because nothing has been assigned to that field yet:
item['a'] = x
item['_junk'] = {}
item['_junk']['b'] = y
item['_junk']['c'] = z
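To see the difference outside of a spider, here is a minimal sketch. It assumes a plain dict as a stand-in for the Scrapy item (a plain dict raises the same KeyError on a missing key), and the helper names build_item and missing_key are made up for illustration:

```python
import json


def build_item(x, y, z):
    # A plain dict stands in for the Scrapy item (assumption for illustration).
    item = {}
    item['a'] = x
    # Create the nested dict first; it can also be filled in one step.
    item['_junk'] = {'b': y, 'c': z}
    return item


def missing_key():
    # Reproduces the original problem: '_junk' is read before it exists.
    item = {}
    try:
        item['_junk']['b'] = 2
    except KeyError as exc:
        return exc.args[0]
    return None


if __name__ == '__main__':
    # Serialize a one-element list, like the feed output in the question.
    print(json.dumps([build_item(1, 2, 3)]))
    print(missing_key())
```

With this layout, only a and _junk should need to be declared as fields in items.py, since b and c are ordinary keys of the dict stored in _junk rather than fields of the item.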