Assume I have the following dictionaries:
{name: "john", place: "nyc", owns: "gold", quantity: 30}
{name: "john", place: "nyc", owns: "silver", quantity: 20}
{name: "jane", place: "nyc", owns: "platinum", quantity: 5}
{name: "john", place: "chicago", owns: "brass", quantity: 60}
{name: "john", place: "chicago", owns: "silver", quantity: 40}
I have hundreds of these small dictionaries. I need to merge the ones that share a subset of common keys, (name, place) in this example, into new dictionaries. Ultimately, the output should look like the following:
{name: "john", place: "nyc", gold: 30, silver: 20}
{name: "jane", place: "nyc", platinum: 5}
{name: "john", place: "chicago", brass: 60, silver: 40}
Is there an efficient way to do this? All I can think of is brute force: keep track of every possible name-place combination in a list, then traverse the entire data set again for each combination and merge the matching dictionaries into a new one. Thanks!
First, getting the output that you asked for:
data = [{'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
        {'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
        {'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
        {'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
        {'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40}]
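from collections import defaultdict
from itertools import chain

# Group the (owns, quantity) pairs under each (name, place) key in a single pass.
accumulator = defaultdict(list)
for p in data:
    accumulator[p['name'], p['place']].append((p['owns'], p['quantity']))

# Rebuild one flat dict per group. .items() runs on Python 2 and 3; ordering may vary by version.
[dict(chain([('name', name), ('place', place)], rest)) for (name, place), rest in accumulator.items()]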
Out[13]:
[{'name': 'jane', 'place': 'nyc', 'platinum': 5},
{'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
{'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}]
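Since the question mentions merging on a subset of common keys, the same single-pass grouping can be wrapped in a small helper that takes those keys as a parameter. This is only a sketch: the function name `merge_by_keys` and its `label_key` and `value_key` parameters are made up for illustration, and it assumes every record contains all of the named keys.

```python
from collections import defaultdict

def merge_by_keys(records, keys, label_key='owns', value_key='quantity'):
    """Group records by the values of `keys` and fold each
    label/value pair into the group's dict (hypothetical helper)."""
    merged = defaultdict(dict)
    for rec in records:
        group = tuple(rec[k] for k in keys)             # e.g. ('john', 'nyc')
        merged[group][rec[label_key]] = rec[value_key]  # a repeated label overwrites the earlier one
    result = []
    for group, extras in merged.items():
        row = dict(zip(keys, group))  # restore the common keys
        row.update(extras)            # then add gold/silver/... columns
        result.append(row)
    return result

data = [{'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
        {'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
        {'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
        {'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
        {'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40}]

merged = merge_by_keys(data, ('name', 'place'))
```

Each record is visited once, so the work grows linearly with the number of input dicts rather than with the number of name-place combinations times the input size. One caveat of the flat output: an `owns` value that happens to equal one of the grouping keys (say, "name") would overwrite that key.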
Now I have to point out that the list-of-dicts data structure you've asked for is super awkward. Dicts are great for lookups, but they perform best when you can use a single one for the whole group of objects. If you have to search linearly through a bunch of dicts to find the one you want, you've lost the benefit that a dict provides in the first place. That leaves a couple of options: go one level deeper and nest dicts within a dict, or use something else entirely.
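To make the "nest dicts within a dict" option concrete, here is a minimal sketch that keys an outer dict by the (name, place) tuple and stores the holdings in an inner dict. The variable name `holdings` is just for illustration.

```python
from collections import defaultdict

data = [{'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
        {'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
        {'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
        {'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
        {'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40}]

# Outer key: (name, place). Inner dict: what is owned -> quantity.
holdings = defaultdict(dict)
for p in data:
    holdings[p['name'], p['place']][p['owns']] = p['quantity']

# Direct lookups, with no linear search through a list of dicts.
john_chicago = holdings['john', 'chicago']   # all holdings for one name/place
john_nyc_gold = holdings['john', 'nyc']['gold']
print(john_chicago)
print(john_nyc_gold)
```

Note that indexing a `defaultdict` with a missing key silently creates an empty entry, so converting with `dict(holdings)` once the grouping is done may be the safer choice for read-only lookups.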
May I suggest making a list of meaningful objects which each represent one of these people? Either create your own class, or use a namedtuple:
from collections import namedtuple
Person = namedtuple('Person','name place holdings')
[Person(name, place, dict(rest)) for (name, place), rest in accumulator.items()]
Out[17]:
[Person(name='jane', place='nyc', holdings={'platinum': 5}),
Person(name='john', place='chicago', holdings={'brass': 60, 'silver': 40}),
Person(name='john', place='nyc', holdings={'silver': 20, 'gold': 30})]
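For the "create your own class" route, something along these lines might do. It is a sketch, not a recommendation of a particular design: the `total_quantity` method is a hypothetical example of behaviour you could attach, and the grouping step is repeated so the snippet runs on its own.

```python
from collections import defaultdict

class Person(object):
    """Hand-written alternative to the namedtuple (illustrative only)."""

    def __init__(self, name, place, holdings=None):
        self.name = name
        self.place = place
        # Accepts a dict or a list of (owns, quantity) pairs.
        self.holdings = dict(holdings or {})

    def total_quantity(self):
        # Hypothetical convenience method: sum over everything held.
        return sum(self.holdings.values())

    def __repr__(self):
        return 'Person(name=%r, place=%r, holdings=%r)' % (
            self.name, self.place, self.holdings)

data = [{'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
        {'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
        {'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
        {'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
        {'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40}]

# Same grouping step as before: (name, place) -> list of (owns, quantity).
accumulator = defaultdict(list)
for p in data:
    accumulator[p['name'], p['place']].append((p['owns'], p['quantity']))

people = [Person(name, place, rest) for (name, place), rest in accumulator.items()]
totals = {(person.name, person.place): person.total_quantity() for person in people}
```

A class costs a few more lines than the namedtuple, but it gives you a place to hang methods and validation if the objects grow beyond plain records.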