Hi, I'm hoping to get some help. I have a two-column DataFrame df as follows:
Source ID
1 2
2 3
1 2
1 2
1 3
3 1
My intention is to group by Source, divide each ID value by the total ID for its Source group, and attach the result to the original DataFrame, so the new column would look like this:
Source ID ID_new
1 2 2/9
2 3 3/3
1 2 2/9
1 2 2/9
1 3 3/9
3 1 1/1
I've gotten as far as:
df.groupby('Source')['ID'].sum()
to get the total of ID for each Source,
but I'm not sure where to go next.
Try this:
In [79]: df.assign(ID_new=df.ID/df.groupby('Source').ID.transform('sum'))
Out[79]:
Source ID ID_new
0 1 2 0.222222
1 2 3 1.000000
2 1 2 0.222222
3 1 2 0.222222
4 1 3 0.333333
5 3 1 1.000000
If you need it as a new persistent column, you can assign it directly, as @jezrael proposed in the comment:
In [81]: df['ID_new'] = df.ID/df.groupby('Source').ID.transform('sum')
In [82]: df
Out[82]:
Source ID ID_new
0 1 2 0.222222
1 2 3 1.000000
2 1 2 0.222222
3 1 2 0.222222
4 1 3 0.333333
5 3 1 1.000000
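For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and shows why transform('sum') is used rather than sum(): sum() returns one row per group, while transform('sum') returns a value for every original row, so the division lines up by index. The variable names totals and group_total are only illustrative and do not appear in the answer above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Source": [1, 2, 1, 1, 1, 3],
    "ID":     [2, 3, 2, 2, 3, 1],
})

# sum() collapses the frame to one total per Source group
totals = df.groupby("Source")["ID"].sum()

# transform('sum') repeats each group's total on every row of that group,
# so the result has the same length and index as df
group_total = df.groupby("Source")["ID"].transform("sum")

# Row-wise share of each ID within its Source group
df["ID_new"] = df["ID"] / group_total

print(totals)
print(df)
```

Because group_total shares df's index, the division needs no merge or join back onto the original frame.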