Given n dictionaries, write a function that will return a unique dictionary with a list of values for duplicate keys.
Example:
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'b': 4}
d3 = {'a': 5, 'd': 6}
result:
>>> newdict
{'c': 3, 'd': 6, 'a': [1, 5], 'b': [2, 4]}
My code so far:
>>> def merge_dicts(*dicts):
... x = []
... for item in dicts:
... x.append(item)
... return x
...
>>> merge_dicts(d1, d2, d3)
[{'a': 1, 'b': 2}, {'c': 3, 'b': 4}, {'a': 5, 'd': 6}]
What would be the best way to produce a new dictionary that yields a list of values for those duplicate keys?
Python provides a simple and fast solution to this: the defaultdict
in the collections
module. From the examples in the documentation:
Using
list
as thedefault_factory
, it is easy to group a sequence of key-value pairs into a dictionary of lists:>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
... d[k].append(v)
...
>>> d.items() [('blue', [2, 4]), ('red', 1), ('yellow', [1, 3])]When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the
default_factory
function which returns an empty list. Thelist.append()
operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and thelist.append()
operation adds another value to the list.
In your case, that would be roughly:
import collections
def merge_dicts(*dicts):
res = collections.defaultdict(list)
for d in dicts:
for k, v in d.iteritems():
res[k].append(v)
return res
>>> merge_dicts(d1, d2, d3)
defaultdict(<type 'list'>, {'a': [1, 5], 'c': [3], 'b': [2, 4], 'd': [6]})
def merge_dicts(*dicts):
d = {}
for dict in dicts:
for key in dict:
try:
d[key].append(dict[key])
except KeyError:
d[key] = [dict[key]]
return d
This retuns:
{'a': [1, 5], 'b': [2, 4], 'c': [3], 'd': [6]}
There is a slight difference to the question. Here all dictionary values are lists. If that is not to be desired for lists of length 1, then add:
for key in d:
if len(d[key]) == 1:
d[key] = d[key][0]
before the return d
statement. However, I cannot really imagine when you would want to remove the list. (Consider the situation where you have lists as values; then removing the list around the items leads to ambiguous situations.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With