
Pandas: Calculate the percentage of multiple columns, saving the result in new columns - best way

Tags: python, pandas

My aim is to express several columns as a percentage of another column. The resulting columns should be kept in the same dataframe.

        A    B  Divisor
2000    8   31  166
2001    39  64  108
2002    68  8   142
2003    28  2   130
2004    55  61  150

result:

        A   B   Divisor  perc_A   perc_B
2000    8   31  166      4.8      18.7
2001    39  64  108      36.1     59.3
2002    68  8   142      47.9     5.6
2003    28  2   130      21.5     1.5
2004    55  61  150      36.7     40.7

My solution:

def percentage(divisor, columns, heading, dframe):
    # Add one percentage column per input column (modifies dframe in place).
    for col in columns:
        heading_new = heading + col
        dframe[heading_new] = dframe[col] / dframe[divisor] * 100
    return dframe

df_new = percentage("Divisor", df.columns.values[:2], "perc_", df)

The solution above works. But is there a more efficient way to get this result?

(I know there are already similar questions, but I couldn't find one where the results are saved in the same dataframe without losing the original columns.)

Thanks

Tomas asked Jan 25 '23 10:01


2 Answers

Select the first 2 columns with DataFrame.iloc, divide them by the Divisor column with DataFrame.div, multiply by 100, rename the results with DataFrame.add_prefix, and attach them to the original with DataFrame.join:

df = df.join(df.iloc[:, :2].div(df['Divisor'], axis=0).mul(100).add_prefix('perc_'))
print(df)

       A   B  Divisor     perc_A     perc_B
2000   8  31      166   4.819277  18.674699
2001  39  64      108  36.111111  59.259259
2002  68   8      142  47.887324   5.633803
2003  28   2      130  21.538462   1.538462
2004  55  61      150  36.666667  40.666667
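To try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand and, as an optional extra that is not part of the one-liner above, rounds the percentages to one decimal so they match the table in the question. The names `perc` and `result` are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "A": [8, 39, 68, 28, 55],
        "B": [31, 64, 8, 2, 61],
        "Divisor": [166, 108, 142, 130, 150],
    },
    index=[2000, 2001, 2002, 2003, 2004],
)

# Divide the first two columns by 'Divisor' row by row
# (axis=0 aligns the Series on the row index), scale to percent,
# and prefix the resulting column names.
perc = df.iloc[:, :2].div(df["Divisor"], axis=0).mul(100).add_prefix("perc_")

# Optional: round to one decimal to match the question's expected table.
result = df.join(perc.round(1))
print(result)
```

Because `join` returns a new DataFrame, `df` keeps its original three columns here; assign the result back to `df` if you want to overwrite it.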

Your function can be simplified to:

def percentage(divisor, columns, heading, dframe):
    # Returns a new DataFrame; dframe itself is not modified.
    return dframe.join(dframe[columns].div(dframe[divisor], axis=0).mul(100).add_prefix(heading))

df_new = percentage("Divisor",df.columns.values[:2],"perc_",df)
jezrael answered Jan 31 '23 22:01


You can reshape the divisor into a column vector so that it broadcasts across the selected columns:

df[['perc_A', 'perc_B']] = df[['A', 'B']] / df['Divisor'].values[:,None] * 100
Mykola Zotko answered Jan 31 '23 21:01
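For readers unsure why the reshape in the second answer is needed, the following sketch spells out the shapes involved. It assumes the sample data from the question; `divisor` and `column_vector` are illustrative names, and `to_numpy()` is used here in place of `.values`.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "A": [8, 39, 68, 28, 55],
        "B": [31, 64, 8, 2, 61],
        "Divisor": [166, 108, 142, 130, 150],
    },
    index=[2000, 2001, 2002, 2003, 2004],
)

divisor = df["Divisor"].to_numpy()  # 1-D array, shape (5,)
column_vector = divisor[:, None]    # 2-D array, shape (5, 1)

# The (5, 2) block of columns divided by a (5, 1) array broadcasts
# across columns, so each row is divided by its own divisor.
df[["perc_A", "perc_B"]] = df[["A", "B"]] / column_vector * 100
print(df)
```

Unlike the `join` version, this assigns the new columns directly into `df`.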