
How to handle multiple keys for a dictionary in python?

I've been searching for a way to store multiple values under a single key in a dict when a duplicate key is found.

Let's take an example:

list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']
new_dict = dict(zip(list_1,list_2))
...output...
{'8': 'd', '4': 'a', '6': 'b'}

Expected output :

{'8': 'c,d', '4': 'a', '6': 'b'}

When I combine the two lists above into one dict, I run into the problem that a dict can't hold two '8' keys. That is the default behavior, and I understand why.
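To make the overwrite visible, here is a minimal sketch that builds the same dict by hand and records each collision. The names `seen` and `overwritten` are only illustrative and are not part of the original question.

```python
list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']

seen = {}
overwritten = []  # (key, old value, new value) for every collision
for key, value in zip(list_1, list_2):
    if key in seen:
        overwritten.append((key, seen[key], value))
    seen[key] = value  # a later assignment replaces the earlier value

print(seen)         # {'4': 'a', '6': 'b', '8': 'd'}
print(overwritten)  # [('8', 'c', 'd')]
```

This is the same result `dict(zip(list_1, list_2))` gives: the last value seen for a key wins.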

Some of the options for handling this scenario are:

1) Check whether the key already exists in the dict; if it does, append the new value to that key's existing value.

2) Create a mutable object to reference each key, so that you can have multiple duplicate keys. (Not really my use case.)

So, how can I get the expected output using option #1?

PanDe asked Mar 07 '23 02:03



2 Answers

defaultdict/dict.setdefault

Let's jump into it:

  1. Iterate over items consecutively
  2. Append string values belonging to the same key
  3. Once done, iterate over each key-value pair and join everything together for your final result.

from collections import defaultdict

d = defaultdict(list)   
for i, j in zip(list_1, list_2):
    d[i].append(j)

The defaultdict keeps things simple and is efficient for appending. If you don't want to use a defaultdict, use dict.setdefault instead (this is slightly less efficient, because a new empty list is created on every call, even when the key already exists):

d = {}
for i, j in zip(list_1, list_2):
    d.setdefault(i, []).append(j)

# Step 3: join each key's list of values into a single string
new_dict = {k: ','.join(v) for k, v in d.items()}
print(new_dict)
{'4': 'a', '6': 'b', '8': 'c,d'}
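The remark above that dict.setdefault is a bit less efficient can be illustrated with a small sketch. It swaps the plain `[]` for a counting helper, `new_list`, which is hypothetical and exists only to show how often a default list gets built in each approach.

```python
from collections import defaultdict

list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']

created = {'setdefault': 0, 'defaultdict': 0}

def new_list(label):
    """Return a fresh list and count how often one is built (hypothetical helper)."""
    created[label] += 1
    return []

plain = {}
for key, value in zip(list_1, list_2):
    # The default argument is evaluated before setdefault runs,
    # so a list is built even when the key already exists.
    plain.setdefault(key, new_list('setdefault')).append(value)

grouped = defaultdict(lambda: new_list('defaultdict'))
for key, value in zip(list_1, list_2):
    # The factory only runs when the key is missing.
    grouped[key].append(value)

print(created)  # {'setdefault': 4, 'defaultdict': 3}
```

Both loops produce the same grouping; the only difference is the one throwaway list built for the repeated '8'. For small inputs this overhead is unlikely to matter.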

Pandas DataFrame.groupby + agg

If you need performance on large inputs, pandas may be worth trying:

import pandas as pd

df = pd.DataFrame({'A' : list_1, 'B' : list_2})
new_dict = df.groupby('A').B.agg(','.join).to_dict()

print(new_dict)
{'4': 'a', '6': 'b', '8': 'c,d'}
cs95 answered Mar 16 '23 08:03



You can do it with a for loop that iterates over the two lists:

list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']

new_dict = {}
for k, v in zip(list_1, list_2):
    if k in new_dict:
        new_dict[k] += ',' + v  # no space, to match the expected 'c,d'
    else:
        new_dict[k] = v

This may be inefficient for large inputs, because each += builds a new string, but it works just fine in simple cases.
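One way to sidestep the efficiency concern mentioned above, while keeping the explicit "does the key already exist?" check from option #1, is to collect values in lists and join each list only once at the end. This is a sketch, not part of the original answer; the function name `combine` and its `sep` parameter are assumptions made for illustration.

```python
def combine(keys, values, sep=','):
    """Group values by key, then join each group once (hypothetical helper)."""
    groups = {}
    for k, v in zip(keys, values):
        if k in groups:
            groups[k].append(v)  # key seen before: extend its list
        else:
            groups[k] = [v]      # first time this key appears
    return {k: sep.join(vs) for k, vs in groups.items()}

print(combine(['4', '6', '8', '8'], ['a', 'b', 'c', 'd']))
# {'4': 'a', '6': 'b', '8': 'c,d'}
```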

Thanks to @Ev. Kounis and @bruno desthuilliers, who pointed out a few improvements to the original answer.


coldspeed's answer is more efficient than mine. I am keeping this one here because it is still correct and I don't see the point in deleting it.

Gianluca Micchi answered Mar 16 '23 09:03
