I have the following 'worker' that initially returned a single JSON object, but I would like it to return multiple JSON objects:
from lxml import etree
from requests import get

def data_worker(data):
    _cats, index, total = data
    _breeds = {}
    try:
        url = _channels['feedUrl']  # _channels is assumed to be in scope (e.g. a module-level dict)
        r = get(url, timeout=5)
        rss = etree.XML(r.content)
        tags = rss.xpath('//cats/item')
        _cats['breeds'] = {}
        for t in tags:
            _cats['breeds']["".join(t.xpath('breed/@url'))] = True
            _breeds['url'] = "".join(t.xpath('breed/@url'))
        return [_cats, _breeds]
    except Exception:  # swallows all errors; both dicts are returned as-is
        return [_cats, _breeds]
This worker is passed to a multiprocessing pool:
cats, breeds = pool.map(data_worker, data, chunksize=1)
When I run the pool with the worker returning just one object (i.e. only _cats), it works fine, but when I try to return multiple JSON objects I get this error:
File "crawl.py", line 111, in addFeedData
[cats, breeds] = pool.map(data_worker, data, chunksize=1)
ValueError: too many values to unpack
How can I return two separate JSON objects from data_worker? I need them to be separate JSON objects. Note that I have already tried the following, none of which worked:
[cats, breeds] = pool.map(data_worker, data, chunksize=1)
(cats, breeds) = pool.map(data_worker, data, chunksize=1)
return (_cats, _breeds)
First of all, I think you meant to write this:
cats, breeds = pool.map(data_worker, data, chunksize=1)
But even that won't work: data_worker returns a pair, and map() returns a list of whatever the worker returns. So you should do this instead:
cats = []
breeds = []
for cat, breed in pool.map(data_worker, data, chunksize=1):
    cats.append(cat)
    breeds.append(breed)
This will give you the two lists you seek.
In other words, you expected a pair of lists, but you got a list of pairs.
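Equivalently, you can transpose the list of pairs in one step with zip(*...). Here is a minimal, self-contained sketch; the toy data_worker and its inputs are hypothetical stand-ins for the real crawler, just to show the shape of the result:

from multiprocessing import Pool

def data_worker(item):
    # Hypothetical stand-in for the real worker: returns a (cat, breed) pair.
    return {'name': item}, {'url': 'http://example.com/' + item}

if __name__ == '__main__':
    data = ['abyssinian', 'bengal', 'manx']
    with Pool() as pool:
        results = pool.map(data_worker, data, chunksize=1)  # a list of pairs
    # zip(*results) transposes the list of pairs into a pair of tuples,
    # and map(list, ...) turns each tuple back into a list.
    cats, breeds = map(list, zip(*results))
    print(cats)    # [{'name': 'abyssinian'}, {'name': 'bengal'}, {'name': 'manx'}]
    print(breeds)  # [{'url': 'http://example.com/abyssinian'}, ...]

One caveat: the zip idiom raises ValueError when results is empty, so the explicit loop above is safer if data might be an empty list.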