
Python pandas: convert DataFrame to dictionary with multiple values

I have a DataFrame with two columns, Address and ID. I want to merge the IDs that share the same address into a dictionary:

import pandas as pd

df = pd.DataFrame({'Address': ['12 A', '66 C', '10 B', '10 B', '12 A', '12 A'],
                   'ID': ['Aa', 'Bb', 'Cc', 'Dd', 'Ee', 'Ff']})
# Index by Address, then convert the ID column to a dict
AS = df.set_index('Address')['ID'].to_dict()

print(df)

  Address  ID
0    12 A  Aa
1    66 C  Bb
2    10 B  Cc
3    10 B  Dd
4    12 A  Ee
5    12 A  Ff

print(AS)

{'12 A': 'Ff', '66 C': 'Bb', '10 B': 'Dd'}

Only the last ID for each address is kept. What I want is for duplicate addresses to store all of their IDs, like:

{'66 C': ['Bb'], '12 A': ['Aa','Ee','Ff'], '10 B': ['Cc','Dd']}
user2872701 asked Nov 21 '13 04:11



People also ask

Can I add multiple values to dictionary Python?

A Python dictionary maps each key to exactly one value, so to hold several values under a single key, that one value has to be a container. You store the container under the key and add the individual values to it. Common containers are lists, tuples, and sets.
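As a minimal sketch of that idea in plain Python (no pandas), the pairs below mirror the question's Address/ID data. The names `pairs` and `grouped` are illustrative only and do not come from the original post.

```python
from collections import defaultdict

# (address, id) pairs mirroring the question's sample data (illustrative)
pairs = [("12 A", "Aa"), ("66 C", "Bb"), ("10 B", "Cc"),
         ("10 B", "Dd"), ("12 A", "Ee"), ("12 A", "Ff")]

# defaultdict(list) creates an empty list the first time a key is accessed
grouped = defaultdict(list)
for address, id_ in pairs:
    grouped[address].append(id_)

grouped = dict(grouped)  # convert back to a plain dict
print(grouped)
```

A list keeps duplicates and insertion order; a set could be used instead if repeated IDs should be collapsed.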

How do you create a dictionary using multiple values?

In Python, to build a dictionary in which one key has multiple values, associate each key with a single object that can itself hold several values. A list works when values need to be added later; a tuple works when the group of values is fixed.

How do you convert a DataFrame to a dictionary in Python?

To convert a pandas DataFrame to a dictionary, use the to_dict() method. Its orient parameter defaults to 'dict', so when no orient is specified, to_dict() returns the data in the format {column -> {index -> value}}.
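A small sketch of what a few orient values return, assuming a two-row frame made up for illustration. Note that none of these group rows that share a value, which is why the question needs groupby.

```python
import pandas as pd

# Tiny illustrative frame (not the question's full data)
df = pd.DataFrame({"Address": ["12 A", "66 C"], "ID": ["Aa", "Bb"]})

# Default orient="dict": {column -> {index -> value}}
by_column = df.to_dict()

# orient="list": {column -> [values]}
as_lists = df.to_dict(orient="list")

# orient="records": one dict per row
as_records = df.to_dict(orient="records")

print(by_column)
print(as_lists)
print(as_records)
```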

Can a dictionary have two values for one key?

Not directly. Each key in a dictionary is unique, so the same key cannot appear twice. Assigning to a key that already exists overwrites the value stored under it. If a key needs to store multiple values, the value associated with that key should be a container such as a list or another dictionary.
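A short sketch of the overwrite behaviour, followed by one way to keep both values by storing a list under the key. The variable names are hypothetical.

```python
# Assigning to an existing key replaces the stored value
ids = {}
ids["12 A"] = "Aa"
ids["12 A"] = "Ee"  # same key: "Aa" is lost
overwritten = dict(ids)

# setdefault returns the existing list for the key, or stores a new empty one
ids_multi = {}
ids_multi.setdefault("12 A", []).append("Aa")
ids_multi.setdefault("12 A", []).append("Ee")

print(overwritten)
print(ids_multi)
```

This is the same effect seen in the question, where to_dict() keeps only the last ID for each repeated address.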


1 Answer

I think you can use groupby and a dictionary comprehension here:

>>> df
  Address  ID
0    12 A  Aa
1    66 C  Bb
2    10 B  Cc
3    10 B  Dd
4    12 A  Ee
5    12 A  Ff
>>> {k: list(v) for k,v in df.groupby("Address")["ID"]}
{'66 C': ['Bb'], '12 A': ['Aa', 'Ee', 'Ff'], '10 B': ['Cc', 'Dd']}
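
For anyone running this on Python 3, here is a self-contained sketch of the same approach plus two variants that are not part of the answer above: apply(list) followed by to_dict(), and sort=False to keep the addresses in order of first appearance. On Python 3.7+ dicts preserve insertion order, and groupby sorts the group keys by default, so the printed key order can differ from the output shown above even though the contents are the same.

```python
import pandas as pd

df = pd.DataFrame({"Address": ["12 A", "66 C", "10 B", "10 B", "12 A", "12 A"],
                   "ID": ["Aa", "Bb", "Cc", "Dd", "Ee", "Ff"]})

# Same idea as the comprehension above: one list of IDs per address
by_comprehension = {k: list(v) for k, v in df.groupby("Address")["ID"]}

# Variant: collect each group into a list, then convert the Series to a dict
by_apply = df.groupby("Address")["ID"].apply(list).to_dict()

# Variant: sort=False keeps addresses in order of first appearance
in_order = df.groupby("Address", sort=False)["ID"].apply(list).to_dict()

print(by_comprehension)
print(in_order)
```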
DSM answered Sep 17 '22 20:09
