
How to create multiple value dictionary from pandas data frame

Let's say I have a pandas DataFrame with two columns, A and B. Each value in column A can have multiple values in column B. I want to create a dictionary that maps each key from column A to all of its values from column B, and those values should be unique as well. Please suggest a way to do this.

Akshay asked Sep 24 '13 16:09



1 Answer

One way is to group by column A:

In [1]: df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])

In [2]: df
Out[2]:
   A  B
0  1  2
1  1  4
2  5  6

In [3]: g = df.groupby('A')

Apply list to column B of each group:

In [4]: g['B'].apply(list)  # g['B'].tolist() relied on "automatic delegation" in older pandas and fails on recent versions
Out[4]:
A
1    [2, 4]
5       [6]
dtype: object

And then call to_dict on this Series:

In [5]: g['B'].apply(list).to_dict()
Out[5]: {1: [2, 4], 5: [6]}
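
For use outside an interactive session, the same steps can be sketched as a short standalone script. It collects each group with .apply(list); the variable name b_by_a is only illustrative, and the data is the toy frame from the session.

```python
import pandas as pd

# Same toy frame as in the session above.
df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=["A", "B"])

# Group column B by the values in column A, collect each group
# into a plain Python list, then turn the resulting Series into a dict.
b_by_a = df.groupby("A")["B"].apply(list).to_dict()

print(b_by_a)
```

Duplicates are kept at this stage, so a key whose B values repeat will show the repeats in its list.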

If you want the values to be unique, use unique (note: this gives a NumPy array for each key rather than a list):

In [11]: df = pd.DataFrame([[1, 2], [1, 2], [5, 6]], columns=['A', 'B'])

In [12]: g = df.groupby('A')

In [13]: g['B'].unique()
Out[13]:
A
1    [2]
5    [6]
dtype: object

In [14]: g['B'].unique().to_dict()
Out[14]: {1: array([2]), 5: array([6])}

Other alternatives are to use .apply(lambda s: set(s)), .apply(lambda s: list(set(s))), .apply(lambda s: list(s.unique()))...
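
A minimal sketch of two of those alternatives, assuming you want plain Python containers as the dictionary values. The frame here is a made-up example with one duplicated row, and the names unique_lists and unique_sets are only illustrative.

```python
import pandas as pd

# Toy frame: key 1 has the value 2 twice, plus a 4.
df = pd.DataFrame([[1, 2], [1, 2], [1, 4], [5, 6]], columns=["A", "B"])
grouped = df.groupby("A")["B"]

# unique() keeps values in first-seen order; wrapping it in list()
# gives a plain list instead of a NumPy array.
unique_lists = grouped.apply(lambda s: list(s.unique())).to_dict()

# set() also drops duplicates, but a set has no defined order.
unique_sets = grouped.apply(lambda s: set(s)).to_dict()

print(unique_lists)
print(unique_sets)
```

The list version is the safer choice when the order of values matters; the set version is convenient when you only need membership tests.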

Andy Hayden answered Jan 01 '23 13:01