I've been searching for a way to add multiple values to a single key in a dict when a duplicate key is found.
Let's take an example:
list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']
new_dict = dict(zip(list_1,list_2))
...output...
{'8': 'd', '4': 'a', '6': 'b'}
Expected output :
{'8': 'c,d', '4': 'a', '6': 'b'}
To combine the two lists above into one dict, I run into the problem that a dict can't have two '8' keys. That is the default behavior, and I understand why.
Some of the options for handling this scenario are:
1) Check whether the key already exists in the dict; if it does, append the new value to that key's existing value
2) Create a mutable object to reference each key, so that you can have multiple duplicate keys (not really my use case)
So, how can I get the expected output using option #1?
defaultdict / dict.setdefault
Let's jump into it:
from collections import defaultdict
d = defaultdict(list)
for i, j in zip(list_1, list_2):
    d[i].append(j)
The defaultdict makes things simple, and is efficient with appending. If you don't want to use a defaultdict, use dict.setdefault instead (this is slightly less efficient, because a new empty list is created on every call even when the key already exists):
d = {}
for i, j in zip(list_1, list_2):
    d.setdefault(i, []).append(j)
# Either way, join each key's list of values into one comma-separated string
new_dict = {k : ','.join(v) for k, v in d.items()}
print(new_dict)
{'4': 'a', '6': 'b', '8': 'c,d'}
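If you need this in more than one place, the two steps above (group, then join) can be wrapped in a small function. This is only a sketch: the name merge_duplicates and the sep parameter are made up for illustration and are not part of any library.

```python
from collections import defaultdict

def merge_duplicates(keys, values, sep=","):
    """Hypothetical helper: group values by key, then join each group with sep."""
    grouped = defaultdict(list)
    for key, value in zip(keys, values):
        grouped[key].append(value)
    # Convert each list of values into a single joined string
    return {key: sep.join(group) for key, group in grouped.items()}

list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']
print(merge_duplicates(list_1, list_2))
```

The result compares equal to {'4': 'a', '6': 'b', '8': 'c,d'}. Note that zip stops at the shorter input, so extra items in the longer list are silently ignored.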
DataFrame.groupby + agg
If you want performance at high volumes, try using pandas:
import pandas as pd
df = pd.DataFrame({'A' : list_1, 'B' : list_2})
new_dict = df.groupby('A').B.agg(','.join).to_dict()
print(new_dict)
{'4': 'a', '6': 'b', '8': 'c,d'}
You can do it with a for loop that iterates over the two lists:
list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']
new_dict = {}
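for k, v in zip(list_1, list_2):
    if k in new_dict:
        # Key seen before: append the new value (',' matches the expected output)
        new_dict[k] += ',' + v
    else:
        # First time this key appears
        new_dict[k] = v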
There might be efficiency problems for huge inputs, because each concatenation builds a new string for that key, but it will work just fine in simple cases.
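If the concatenation cost ever matters, one option is to keep the same loop shape but collect the values in lists and join each list once at the end. A minimal sketch, where the intermediate name collected is just for illustration:

```python
list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']

collected = {}
for k, v in zip(list_1, list_2):
    if k in collected:
        # Appending to a list does not copy the values gathered so far
        collected[k].append(v)
    else:
        collected[k] = [v]

# One join per key instead of one string concatenation per duplicate
new_dict = {k: ','.join(vs) for k, vs in collected.items()}
print(new_dict)
```

This is essentially the dict.setdefault approach written out with an explicit if/else.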
Thanks to @Ev. Kounis and @bruno desthuilliers that pointed out a few improvements to the original answer.
coldspeed's answer is more efficient than mine. I'm keeping this one here because it is still correct and I don't see the point in deleting it.