Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Getting % Rate using Pandas Group By and .sum()

I'd like to get some % rates based on a .groupby() in pandas. My goal is to take an indicator column Ind and get the Rate of A (numerator) divided by the total (A+B) in that year

Example Data:

import pandas as pd
import numpy as np
df: pd.DataFrame = pd.DataFrame([['2011','A',1,2,3], ['2011','B',4,5,6],['2012','A',15,20,4],['2012','B',17,12,12]], columns=["Year","Ind","X", "Y", "Z"])
print(df)
   Year Ind   X   Y   Z
0  2011   A   1   2   3
1  2011   B   4   5   6
2  2012   A  15  20   4
3  2012   B  17  12  12

Example for year 2011: XRate would be summing up the A indicators for X (which would be 1) and dividing byt the total (A+B) which would be 5 thus I would receive an Xrate of 0.20.

I would like to do this for all columns X, Y, Z to get the rates. I've tried doing lambda applys but can't quite get the desired results.

Desired Results:

   Year XRate YRate  ZRate
0  2011  0.20  0.29   0.33 
1  2012  0.47  0.63   0.25
like image 201
Coldchain9 Avatar asked Jan 08 '21 15:01

Coldchain9


People also ask

How do you get percentage on Groupby pandas?

You can caluclate pandas percentage with total by groupby() and DataFrame. transform() method. The transform() method allows you to execute a function for each value of the DataFrame. Here, the percentage directly summarized DataFrame, then the results will be calculated using all the data.

How does pandas count sum and group by?

Use DataFrame. groupby(). sum() to group rows based on one or multiple columns and calculate sum agg function. groupby() function returns a DataFrameGroupBy object which contains an aggregate function sum() to calculate a sum of a given column for each group.

How do you find the percentage of a total in Python?

To calculate a percentage in Python, use the division operator (/) to get the quotient from two numbers and then multiply this quotient by 100 using the multiplication operator (*) to get the percentage. This is a simple equation in mathematics to get the percentage.

What does sum () do in pandas?

Pandas DataFrame sum() Method The sum() method adds all values in each column and returns the sum for each column. By specifying the column axis ( axis='columns' ), the sum() method searches column-wise and returns the sum of each row.


1 Answers

You can group the dataframe on Year and aggregate using sum:

s1 = df.groupby('Year').sum()
s2 = df.query("Ind == 'A'").groupby('Year').sum()

s2.div(s1).round(2).add_suffix('Rate')

      XRate  YRate  ZRate
Year                     
2011   0.20   0.29   0.33
2012   0.47   0.62   0.25
like image 52
Shubham Sharma Avatar answered Oct 18 '22 05:10

Shubham Sharma