I have two large dictionaries. The example below is small, but imagine each dictionary holding close to 100k records:
d1 = {
'0001': [('skiing',0.789),('snow',0.65),('winter',0.56)],
'0002': [('drama', 0.89),('comedy', 0.678),('action',-0.42),
('winter',-0.12),('kids',0.12)]
}
d2 = {
'0001': [('action', 0.89),('funny', 0.58),('sports',0.12)],
'0002': [('dark', 0.89),('Mystery', 0.678),('crime',0.12), ('adult',-0.423)]
}
I want to have a dictionary that has combined values by key from each dictionary:
{
'0001': [
('skiing', 0.789), ('snow', 0.65), ('winter', 0.56),
[('action', 0.89), ('funny', 0.58), ('sports', 0.12)]
],
'0002': [
('drama', 0.89), ('comedy', 0.678), ('action', -0.42),
('winter', -0.12), ('kids', 0.12),
[('dark', 0.89), ('Mystery', 0.678), ('crime', 0.12), ('adult', -0.423)]
]
}
The way I would achieve this is:
for key, value in d1.iteritems():
    if key in d2:
        d1[key].append(d2[key])
But I have read in several places that iteritems() is really slow because it relies on Python-level functions rather than C data structures. How can I do this combine/merge quickly and efficiently?
for k, v in d2.items():
    if k in d1:
        d1[k].extend(v)
    else:
        d1[k] = v
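Note that extend in the loop above produces one flat list per key, whereas the append in the question nests the second dict's tuples as a single sub-list. Here is a minimal Python 3 sketch contrasting the two. The helper name merge_lists and its nested flag are made up for illustration, and the helper copies the lists so that neither input is modified:

```python
def merge_lists(d1, d2, nested=False):
    """Return a new dict combining the list values of d1 and d2 by key.

    merge_lists and the nested flag are illustrative names, not a library API.
    """
    # Shallow-copy d1's lists so the caller's dict is left untouched.
    merged = {key: list(values) for key, values in d1.items()}
    for key, values in d2.items():
        if key in merged:
            if nested:
                # One nested sub-list per key, as in the question's desired output.
                merged[key].append(list(values))
            else:
                # One flat list per key, as in the extend loop.
                merged[key].extend(values)
        else:
            # Key only in d2: keep a copy of its list.
            merged[key] = list(values)
    return merged


d1 = {'0001': [('skiing', 0.789), ('snow', 0.65)]}
d2 = {'0001': [('action', 0.89)], '0003': [('dark', 0.89)]}

flat_result = merge_lists(d1, d2)
nested_result = merge_lists(d1, d2, nested=True)
```

Both variants make one pass over d2 with constant-time dict lookups, so with 100k keys the choice between them is mostly about the output shape you need.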
I think you need to merge the dicts, which Counter addition can do (the output below is from Python 2):
from collections import Counter
res = Counter(d1) + Counter(d2)
>>> res
Counter({'0001': [('skiing', 0.789), ('snow', 0.65), ('winter', 0.56 **...**
For example
from collections import Counter
d1 = {"a":[1,2], "b":[]}
d2 = {"a":[1,3], "b":[5,6]}
res = Counter(d1)+Counter(d2)
>>> res
Counter({'b': [5, 6], 'a': [1, 2, 1, 3]})
This approach also works when d2 has keys that d1 lacks, for example:
d1 = {"a":[1,2], "b":[]}
d2 = {"a":[1,3], "b":[5,6], "c":["ff"]}
res = Counter(d1)+Counter(d2)
>>> res
Counter({'c': ['ff'], 'b': [5, 6], 'a': [1, 2, 1, 3]})
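One caveat: Counter addition compares each summed value with 0, which Python 3 rejects for lists with a TypeError, and a key that exists only in d1 gets added to the integer default 0, which also fails. A rough Python 3 alternative is a defaultdict(list), sketched below. The function name merge_all is hypothetical, and the extra "x" key is added here only to show a key missing from d2:

```python
from collections import defaultdict


def merge_all(*dicts):
    """Concatenate list values by key across any number of dicts.

    merge_all is an illustrative name, not part of the standard library.
    """
    merged = defaultdict(list)  # missing keys start as an empty list
    for d in dicts:
        for key, values in d.items():
            merged[key].extend(values)  # copies items; input lists stay untouched
    return dict(merged)


d1 = {"a": [1, 2], "b": [], "x": [9]}
d2 = {"a": [1, 3], "b": [5, 6], "c": ["ff"]}
res = merge_all(d1, d2)
```

Because every key starts from a fresh empty list, it does not matter which dict a key comes from or how many dicts are passed in.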