I have data in a dataframe df with a categorical column that groups the rows, plus other columns, like this:
id      subid      value
1       10         1.5
1       20         2.5
1       30         7.0 
2       10         12.5
2       40         5
What I need is a column that gives each subid's share of the total value within its id. For example, the result could look like:
id      subid      value     id_sum    proportion
1       10         1.5       11.0      0.136
1       20         2.5       11.0      0.227
1       30         7.0       11.0      0.636
2       10         12.5      17.5      0.714
2       40         5         17.5      0.286
Now, I tried getting the id_sum column by doing:
df['id_sum'] = df.groupby('id')['value'].sum()
But this does not seem to work as hoped. My end goal is to get the proportion column. What is the correct way of getting that?
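groupby('id')['value'].sum() returns one value per id, indexed by id, so assigning it to df aligns on the index instead of filling in every row. Use transform('sum') instead, which returns a result with the same index as df: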
df['id_sum'] = df.groupby('id')['value'].transform('sum')
df['proportion'] = df['value'] / df['id_sum']
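For anyone who wants to check this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the two lines above. The dataframe construction is an assumption based on the table in the question; the real df may have different dtypes or more columns.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed dtypes).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "subid": [10, 20, 30, 10, 40],
    "value": [1.5, 2.5, 7.0, 12.5, 5.0],
})

# transform('sum') computes the per-id total and repeats it on every row
# of that id, so the result lines up with df's own index.
df["id_sum"] = df.groupby("id")["value"].transform("sum")

# Each row's share of its id total.
df["proportion"] = df["value"] / df["id_sum"]

print(df.round(3))
```

If the id_sum column is not needed afterwards, the same thing can be done in one step with df['value'] / df.groupby('id')['value'].transform('sum').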