I have a pandas dataframe as below:
df=pd.DataFrame({'a':['red','yellow','blue'], 'b':[0,0,1], 'c':[0,1,0], 'd':[1,0,0]})
df
which looks like
a b c d
0 red 0 0 1
1 yellow 0 1 0
2 blue 1 0 0
I want to convert it to a dictionary so that I get:
red d
yellow c
blue b
The dataset is quite large, so please avoid any iterative method. I haven't figured out a solution yet. Any help is appreciated.
First of all, if you really want to convert this to a dictionary, it's a little nicer to convert the value you want as a key into the index of the DataFrame:
df.set_index('a', inplace=True)
This looks like:
b c d
a
red 0 0 1
yellow 0 1 0
blue 1 0 0
Your data appears to be in "one-hot" encoding. You first have to reverse that with idxmax, which returns, for each row, the label of the column holding the largest value:
series = df.idxmax(axis=1)
This looks like:
a
red d
yellow c
blue b
dtype: object
Almost there! Now use to_dict on the resulting Series (this is where setting column a as the index helps out, because the index values become the dictionary keys):
series.to_dict()
This looks like:
{'blue': 'b', 'red': 'd', 'yellow': 'c'}
Which I think is what you are looking for. As a one-liner:
df.set_index('a').idxmax(axis=1).to_dict()
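For reference, here is a minimal self-contained sketch of the steps above, run on the sample frame from the question. The names onehot and result are just illustrative, and the row-sum check is an extra assumption of mine rather than part of the original answer: idxmax returns the first column when a row contains no 1 at all, so it can be worth confirming that every row really is one-hot.

```python
import pandas as pd

df = pd.DataFrame({'a': ['red', 'yellow', 'blue'],
                   'b': [0, 0, 1],
                   'c': [0, 1, 0],
                   'd': [1, 0, 0]})

# Move the key column into the index so only the 0/1 columns remain.
onehot = df.set_index('a')

# Optional sanity check (an added assumption): every row has exactly one 1.
# Without it, an all-zero row would silently map to the first column.
is_one_hot = bool((onehot.sum(axis=1) == 1).all())

# For each row, take the label of the column holding the maximum (the 1),
# then turn the resulting Series into a dict keyed by the index.
result = onehot.idxmax(axis=1).to_dict()
print(result)
```

Both the check and the conversion are vectorized, so nothing here loops over rows in Python.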
You can also try this, which masks the zero cells and stacks what is left into long form:
df = df.set_index('a')
df.where(df > 0).stack().reset_index().drop(0, axis=1)
a level_1
0 red d
1 yellow c
2 blue b
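The output above is a DataFrame rather than the dictionary the question asks for. One possible way to finish the job is sketched below; pairs and result are illustrative names. The explicit dropna() is my addition, since whether stack() discards the masked NaN cells by itself depends on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({'a': ['red', 'yellow', 'blue'],
                   'b': [0, 0, 1],
                   'c': [0, 1, 0],
                   'd': [1, 0, 0]}).set_index('a')

# Mask the zero cells as NaN, stack to long form, and drop the NaN entries
# explicitly so the result does not depend on stack()'s default behavior.
pairs = df.where(df > 0).stack().dropna().index

# The remaining index entries are (a, column) tuples, which dict() accepts directly.
result = dict(pairs.tolist())
print(result)
```

Unlike idxmax, this approach keeps nothing for a row that is all zeros, and for a row with more than one 1 the dictionary would keep only the last matching column.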