Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas - One Hot Encoding using values from another Column

Tags:

python

pandas

I have a dataframe as below,

idx vol loc             
0   1   2   unit 68             
1   1   1   unit 179                
2   1   2   unit 345                
3   2   1   unit 233                
4   2   1   unit 235                
5   2   2   unit 313                
6   2   1   unit 313    

I aim to do one-hot encoding for the 'loc' column with the 'vol' values filling in the one-hot encoded columns. Keen to use 'get_dummies' and 'groupby' methods in pandas.

My expected output is as below. My challenge is I am unable to fill the values from the 'vol' column in the newly created one-hot encoded columns.

Tried to spend 2 days but still have not found a workaround for this. Hope you can give me a hint.

Expected

loc unit 179    unit 233    unit 235    unit 312    unit 313    unit 345    unit 68
id                          
1   1   0   0   0   0   2   2
2   0   1   1   0   3   0   0

MWE

import pandas as pd

data =  {'idx': [1, 1, 1, 2, 2, 2, 2],
         'vol': [2, 1, 2, 1, 1, 2, 1],
         'loc': ['unit 68', 'unit 179', 'unit 345', 'unit 233', 'unit 235', 'unit 313', 'unit 313']}

df = pd.DataFrame(data)

# Syntax
# pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None) → ’DataFrame’

df = pd.get_dummies(df, columns=['loc'], prefix='', prefix_sep='')
print(df)

df = df.groupby('idx').agg(sum)
print(df)

Output

   idx  vol  unit 179  unit 233  unit 235  unit 313  unit 345  unit 68
0    1    2         0         0         0         0         0        1
1    1    1         1         0         0         0         0        0
2    1    2         0         0         0         0         1        0
3    2    1         0         1         0         0         0        0
4    2    1         0         0         1         0         0        0
5    2    2         0         0         0         1         0        0
6    2    1         0         0         0         1         0        0

     vol  unit 179  unit 233  unit 235  unit 313  unit 345  unit 68
idx                                                                
1      5         1         0         0         0         1        1
2      5         0         1         1         2         0        0
like image 763
Saravanan K Avatar asked Dec 03 '25 04:12

Saravanan K


2 Answers

You can do

df=df.set_index('idx')
s=df['loc'].str.get_dummies().mul(df['vol'],axis=0).sum(level=0)
     unit179  unit233  unit235  unit313  unit345  unit68
idx                                                     
1          1        0        0        0        2       2
2          0        1        1        3        0       0
like image 96
BENY Avatar answered Dec 04 '25 20:12

BENY


IIUC,

df.pivot_table('vol', 'idx', 'loc', aggfunc='sum', fill_value=0)

Output:

loc  unit 179  unit 233  unit 235  unit 313  unit 345  unit 68
idx                                                           
1           1         0         0         0         2        2
2           0         1         1         3         0        0

Add totals also,

df.pivot_table('vol', 'idx', 'loc', aggfunc='sum', 
               fill_value=0, margins=True, margins_name='vol')

Output:

loc  unit 179  unit 233  unit 235  unit 313  unit 345  unit 68  vol
idx                                                                
1           1         0         0         0         2        2    5
2           0         1         1         3         0        0    5
vol         1         1         1         3         2        2   10

Drop row grand total:

df.pivot_table('vol', 'idx', 'loc', aggfunc='sum', 
               fill_value=0, margins=True, margins_name='vol')\
  .drop('vol')

Output:

loc  unit 179  unit 233  unit 235  unit 313  unit 345  unit 68  vol
idx                                                                
1           1         0         0         0         2        2    5
2           0         1         1         3         0        0    5
like image 30
Scott Boston Avatar answered Dec 04 '25 19:12

Scott Boston



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!