I have data in a dataframe df with a categorical column that groups the rows, plus other columns, like this:
id subid value
1 10 1.5
1 20 2.5
1 30 7.0
2 10 12.5
2 40 5
What I need is a column that gives each row's value as a proportion of the total value within its id. For example, df could become:
id subid value id_sum proportion
1 10 1.5 11.0 0.136
1 20 2.5 11.0 0.227
1 30 7.0 11.0 0.636
2 10 12.5 17.5 0.714
2 40 5 17.5 0.286
Now, I tried getting the id_sum column by doing:
df['id_sum'] = df.groupby('id')['value'].sum()
But this does not seem to work as hoped. My end goal is to get the proportion
column. What is the correct way of getting that?
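Use transform('sum') instead of sum(). sum() returns one value per id, indexed by id, so it does not line up with the rows of df; transform('sum') returns the group total repeated for every row in that group: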
df['id_sum'] = df.groupby('id')['value'].transform('sum')
df['proportion'] = df['value'] / df['id_sum']
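For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the two lines above. The dataframe construction and the `per_id` variable are assumptions added for illustration; only the `transform('sum')` and division steps come from the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "subid": [10, 20, 30, 10, 40],
    "value": [1.5, 2.5, 7.0, 12.5, 5.0],
})

# sum() collapses each group to a single value, indexed by id (1 and 2).
# Assigning that back to df aligns on df's row labels (0..4), not on the
# id column, which is why the original attempt did not give the expected result.
per_id = df.groupby("id")["value"].sum()

# transform("sum") keeps the original row index and repeats each group's
# total on every row of that group.
df["id_sum"] = df.groupby("id")["value"].transform("sum")
df["proportion"] = df["value"] / df["id_sum"]

print(df.round(3))
```

If the id_sum column is not needed on its own, the two steps can probably be combined as df['value'] / df.groupby('id')['value'].transform('sum').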