Let's say I have a pandas DataFrame with two columns (column A and column B). For each value in column 'A' there are multiple values in column 'B'. I want to create a dictionary with multiple values for each key, and those values should be unique as well. Please suggest a way to do this.
In Python, a dictionary key maps to exactly one value, so to associate multiple values with one key, that value must be a container object. A list or a tuple works for this, and a set is a natural choice when the values must also be unique.
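As a rough sketch of the idea above, the following builds a dictionary whose values are containers, using plain Python only. The `pairs` list is hypothetical sample data, and a set is used as the container so that duplicates are dropped automatically.

```python
# Hypothetical sample data: (key, value) pairs with a duplicate for key 1
pairs = [(1, 2), (1, 4), (1, 2), (5, 6)]

multi = {}
for key, value in pairs:
    # setdefault creates an empty set the first time a key is seen,
    # then the value is added to that key's set (duplicates are ignored)
    multi.setdefault(key, set()).add(value)

# Convert each set to a sorted list for stable, readable output
multi_lists = {key: sorted(values) for key, values in multi.items()}
print(multi_lists)  # {1: [2, 4], 5: [6]}
```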
The dict.update() method merges another dictionary (or an iterable of key-value pairs) into an existing dictionary: keys that are not yet present are inserted, and keys that already exist have their values replaced. Note that update() does not append to a value that is already stored; to add to a list held under a key, modify that list directly.
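A small illustration of how update() behaves may help here. Note that update() replaces the value stored under an existing key rather than extending it, so adding to a list of values needs a different call. The dictionary contents below are made-up sample data.

```python
d = {1: [2]}

# update() inserts key 5 and REPLACES the list stored under key 1
d.update({1: [4], 5: [6]})
print(d)  # {1: [4], 5: [6]}

# To add another value under an existing key, append to its list instead
d[1].append(2)

# setdefault handles the case where the key may not exist yet
d.setdefault(7, []).append(8)
print(d)  # {1: [4, 2], 5: [6], 7: [8]}
```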
To convert a pandas DataFrame to a dictionary, use the to_dict() method. Its orient parameter defaults to 'dict', which returns the DataFrame in the format {column -> {index -> value}}.
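For comparison, here is a short sketch of what to_dict() returns for a small DataFrame under the default orient and under orient='list'. Neither form groups column B by column A, which is why a groupby is still needed for the question asked.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])

# Default orient='dict': {column -> {index -> value}}
by_column = df.to_dict()
print(by_column)  # {'A': {0: 1, 1: 1, 2: 5}, 'B': {0: 2, 1: 4, 2: 6}}

# orient='list': {column -> [values]}
as_lists = df.to_dict(orient='list')
print(as_lists)  # {'A': [1, 1, 5], 'B': [2, 4, 6]}
```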
In the other direction, we can create a pandas Series from a Python dictionary by passing the dictionary to the pandas.Series() constructor. The dictionary keys become the index of the new Series and the dictionary values become its values.
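A minimal sketch of that round trip, with made-up keys and values: the dictionary keys end up as the Series index, and to_dict() on the Series gives back an equivalent dictionary.

```python
import pandas as pd

data = {'a': 1, 'b': 2, 'c': 3}  # hypothetical sample dictionary

s = pd.Series(data)
index_labels = s.index.tolist()  # dictionary keys become the index
round_trip = s.to_dict()         # back to a plain dictionary

print(index_labels)  # ['a', 'b', 'c']
print(round_trip)    # {'a': 1, 'b': 2, 'c': 3}
```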
One way is to group by column A:
In [1]: df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])
In [2]: df
Out[2]:
   A  B
0  1  2
1  1  4
2  5  6
In [3]: g = df.groupby('A')
Convert each group's column B to a list:
In [4]: g['B'].apply(list) # older pandas versions also accepted g['B'].tolist() via "automatic delegation"; recent versions need the explicit apply
Out[4]:
A
1    [2, 4]
5       [6]
dtype: object
And then call to_dict on this Series:
In [5]: g['B'].apply(list).to_dict()
Out[5]: {1: [2, 4], 5: [6]}
If you want these to be unique, use unique (note: this will create a NumPy array rather than a list):
In [11]: df = pd.DataFrame([[1, 2], [1, 2], [5, 6]], columns=['A', 'B'])
In [12]: g = df.groupby('A')
In [13]: g['B'].unique()
Out[13]:
A
1    [2]
5    [6]
dtype: object
In [14]: g['B'].unique().to_dict()
Out[14]: {1: array([2]), 5: array([6])}
Other alternatives are to use .apply(lambda s: set(s)), .apply(lambda s: list(set(s))), .apply(lambda s: list(s.unique())), ...
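Putting the pieces together, a self-contained sketch that answers the original question (unique values per key, stored as plain lists) could look like the following. The sample data is made up, with a duplicate row added so that the deduplication is visible; s.unique().tolist() is used here because it keeps the values in order of first appearance, whereas going through set() does not guarantee any order.

```python
import pandas as pd

# Hypothetical sample data: the pair (1, 2) appears twice
df = pd.DataFrame([[1, 2], [1, 4], [1, 2], [5, 6]], columns=['A', 'B'])

# For each value of A, collect the unique values of B as a plain list,
# then turn the resulting Series (indexed by A) into a dictionary
result = (
    df.groupby('A')['B']
      .apply(lambda s: s.unique().tolist())
      .to_dict()
)

print(result)  # {1: [2, 4], 5: [6]}
```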