Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Pandas - Convert column to percentage on Groupby DF

Tags:

python

pandas

I have a dataframe that I created by a groupby:

hmdf = pd.DataFrame(hm01)
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]

hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
vals1 = ['April    ', 'May      ', 'June     ', 'July     ', 'August   ', 'September', 'October  ', 'November ', 'December ', 'January  ', 'February ', 'March    ']

df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
df_hml = df_hm.reindex(vals1)

The DF looks like this:

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April               34          24          22          20
May                 29          26          21          25
June                19          39          22          20
July                23          39          18          20
August              36          30          34           0
September           35          23          41           0
October             36          37          27           0
November            38          31          30           0
December            36          41          23           0
January             34          30          35           0
February            37          26          37           0
March               36          31          33           0

The column names are from variables (threeYr,twoYr,oneYr,Yr), and I want to convert the dataframe so that the numbers are percentages of the total for each column, but I cant get it to work.

This is what I want:

FinancialYear       2014/2015   2015/2016   2016/2017   2017/2018
Month               
April                   9%          6%          6%         24%
May                     7%          7%          6%         29%
June                    5%         10%          6%         24%
July                    6%         10%          5%         24%
August                  9%          8%         10%          0%
September               9%          6%         12%          0%
October                 9%         10%          8%          0%
November               10%          8%          9%          0%
December                9%         11%          7%          0%
January                 9%          8%         10%          0%
February                9%          7%         11%          0%
March                   9%          8%         10%          0%

Could anyone help me with doing this?

Edit: I tried the response found at this link: pandas convert columns to percentages of the totals..... I could not get that to work for my dataframe + it does not explain well (to me) how to make it work for any DF. The response from John Galt I believe is better than that response (my opinion).

like image 831
ScoutEU Avatar asked Aug 09 '17 13:08

ScoutEU


1 Answers

Here's one way

In [1371]: (100. * df / df.sum()).round(0)
Out[1371]:
               2014/2015  2015/2016  2016/2017  2017/2018
FinancialYear
April                9.0        6.0        6.0       24.0
May                  7.0        7.0        6.0       29.0
June                 5.0       10.0        6.0       24.0
July                 6.0       10.0        5.0       24.0
August               9.0        8.0       10.0        0.0
September            9.0        6.0       12.0        0.0
October              9.0       10.0        8.0        0.0
November            10.0        8.0        9.0        0.0
December             9.0       11.0        7.0        0.0
January              9.0        8.0       10.0        0.0
February             9.0        7.0       11.0        0.0
March                9.0        8.0       10.0        0.0

And, if you want to rounded to 1 decimal place with value as strings with '%'

In [1375]: (100. * df / df.sum()).round(1).astype(str) + '%'
Out[1375]:
              2014/2015 2015/2016 2016/2017 2017/2018
FinancialYear
April              8.7%      6.4%      6.4%     23.5%
May                7.4%      6.9%      6.1%     29.4%
June               4.8%     10.3%      6.4%     23.5%
July               5.9%     10.3%      5.2%     23.5%
August             9.2%      8.0%      9.9%      0.0%
September          8.9%      6.1%     12.0%      0.0%
October            9.2%      9.8%      7.9%      0.0%
November           9.7%      8.2%      8.7%      0.0%
December           9.2%     10.9%      6.7%      0.0%
January            8.7%      8.0%     10.2%      0.0%
February           9.4%      6.9%     10.8%      0.0%
March              9.2%      8.2%      9.6%      0.0%
like image 59
Zero Avatar answered Nov 12 '22 04:11

Zero