I have a dataset with multiple columns that I wish to one-hot encode. However, I don't want a separate encoding for each column, since the columns all draw values from the same set of items. What I want is one "set" of dummy variables that covers all the columns. See my code for a better explanation.
Suppose my dataframe looks like this:
In [103]: dum = pd.DataFrame({'ch1': ['A', 'C', 'A'], 'ch2': ['B', 'G', 'F'], 'ch3': ['C', 'D', 'E']})
In [104]: dum
Out[104]:
  ch1 ch2 ch3
0   A   B   C
1   C   G   D
2   A   F   E
If I execute
pd.get_dummies(dum)
the output will be:
   ch1_A  ch1_C  ch2_B  ch2_F  ch2_G  ch3_C  ch3_D  ch3_E
0      1      0      1      0      0      1      0      0
1      0      1      0      0      1      0      1      0
2      1      0      0      1      0      0      0      1
However, what I would like to obtain is something like this:
A B C D E F G
1 1 1 0 0 0 0
0 0 1 1 0 0 1
1 0 0 0 1 1 0
Instead of having multiple columns representing the encoding, e.g. ch1_A and ch1_C, I only wish to have one group of columns (A, B, and so on) whose value is 1 whenever the corresponding value appears in any of the columns ch1, ch2, ch3.
To clarify: in my original dataset, a single row won't contain the same value (A, B, C, ...) more than once; each value appears in at most one of the columns.
Using stack and str.get_dummies (note: the level argument to sum was deprecated and removed in pandas 2.0, so aggregate with groupby instead):
dum.stack().str.get_dummies().groupby(level=0).sum()
Out[938]:
   A  B  C  D  E  F  G
0  1  1  1  0  0  0  0
1  0  0  1  1  0  0  1
2  1  0  0  0  1  1  0
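An equivalent route, if you prefer to stay with pd.get_dummies, is to stack first and aggregate afterwards. This is a minimal sketch of the same idea; groupby(level=0) collapses the stacked rows back to the original row numbers:
import pandas as pd

dum = pd.DataFrame({'ch1': ['A', 'C', 'A'], 'ch2': ['B', 'G', 'F'], 'ch3': ['C', 'D', 'E']})

# stack() turns the wide frame into a Series whose first index level
# is the original row number; get_dummies then encodes each value
encoded = pd.get_dummies(dum.stack())

# Summing within each original row yields one indicator row per row of dum
print(encoded.groupby(level=0).sum())
which prints the same table as above.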
You could use pd.crosstab to create a frequency table:
import pandas as pd

dum = pd.DataFrame({'ch1': ['A', 'C', 'A'], 'ch2': ['B', 'G', 'F'], 'ch3': ['C', 'D', 'E']})

# Reshape to a long Series; the first index level is the original row number
stacked = dum.stack()
index = stacked.index.get_level_values(0)

# Count how often each value occurs in each original row
result = pd.crosstab(index=index, columns=stacked)

# Drop the default axis names for cleaner output
result.index.name = None
result.columns.name = None
print(result)
yields
   A  B  C  D  E  F  G
0  1  1  1  0  0  0  0
1  0  0  1  1  0  0  1
2  1  0  0  0  1  1  0
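Note that pd.crosstab produces counts, not strictly 0/1 indicators. Since the asker guarantees each value appears at most once per row, counts and indicators coincide here; if duplicates were possible, one way to cap the counts is clip. A small sketch, using hypothetical data with a duplicate 'A' in the first row:
import pandas as pd

# 'A' appears twice in row 0 (hypothetical data, for illustration only)
dup = pd.DataFrame({'ch1': ['A', 'C', 'A'], 'ch2': ['A', 'G', 'F'], 'ch3': ['C', 'D', 'E']})
stacked = dup.stack()
counts = pd.crosstab(index=stacked.index.get_level_values(0), columns=stacked)

# clip(upper=1) turns any count greater than 1 into a 1
indicators = counts.clip(upper=1)
print(indicators)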