Column names are: ID,1,2,3,4,5,6,7,8,9.
The col values are either 0 or 1
My dataframe looks like this:
ID 1 2 3 4 5 6 7 8 9 1002 0 1 0 1 0 0 0 0 0 1003 0 0 0 0 0 0 0 0 0 1004 1 1 0 0 0 0 0 0 0 1005 0 0 0 0 1 0 0 0 0 1006 0 0 0 0 0 1 0 0 0 1007 1 0 1 0 0 0 0 0 0 1000 0 0 0 0 0 0 0 0 0 1009 0 0 1 0 0 0 1 0 0
I want the column names in front of the ID where the value in a row is 1.
The Dataframe i want should look like this:
ID Col2 1002 2 // has 1 at Col(2) and Col(4) 1002 4 1004 1 // has 1 at col(1) and col(2) 1004 2 1005 5 // has 1 at col(5) 1006 6 // has 1 at col(6) 1007 1 // has 1 at col(1) and col(3) 1007 3 1009 3 // has 1 at col(3) and col(7) 1009 7
Please help me in this, Thanks in advance
get_dummies() is used for data manipulation. It converts categorical data into dummy or indicator variables.
Pandas get_dummiesThe get_dummies method of Pandas is another way to create one-hot encoded features. data — the dataframe on which you want to apply one-hot encoding.
get_dummies there is a parameter i.e. drop_first allows you whether to keep or remove the reference (whether to keep k or k-1 dummies out of k categorical levels).
Pretty one-liner :)
new_df = df.idxmax(axis=1)
set_index
+ stack
, stack will dropna by default
df.set_index('ID',inplace=True) df[df==1].stack().reset_index().drop(0,1) Out[363]: ID level_1 0 1002 2 1 1002 4 2 1004 1 3 1004 2 4 1005 5 5 1006 6 6 1007 1 7 1007 3 8 1009 3 9 1009 7
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With