I have summarized the data frame of col1 col2 col3 count, Add different weightage on that count
Data set is like
# Current result
col1 col2 col3 Count Weightage_count
---------------------------------------------
1: A S1 X110 2 2
2: A S1 X150 2 0.5
3: A S2 X212 2 1
4: A S2 X200 1 0.5
5: A S2 X211 1 0.25
6: B S3 X311 4 4
7: C S4 X222 3 1.5
data = {'Col1':['A','A','A','A','A','B','C'],
'Col2':['S1','S1','S2','S2','S2','S3','S4'],
'Col3':['X110','X150','X212','X200','X211','X311','X222'],
'Count': [2,2,2,1,1,4,3],
'Weightage_count':[2, 0.5, 1, 0.5, 0.25, 4, 1.5]}
df = pd.DataFrame(data)
Want to calculate results based on col1 and col2.
Expected result.
Col1 Col2 Result
-------------------
1 A S1 0.625
2 A S2 0.5
3 B S3 1
4 C S4 0.5
First aggregate sum and then multiple columns in DataFrame.eval:
df = (df.groupby(['Col1','Col2'])
.sum()
.eval('Weightage_count / Count')
.reset_index(name='Result'))
print (df)
Col1 Col2 Result
0 A S1 0.6250
1 A S2 0.4375
2 B S3 1.0000
3 C S4 0.5000
Or divide by Series.div with DataFrame.pop for remove columns after processing:
df = df.groupby(['Col1','Col2'], as_index=False)[['Count','Weightage_count']].sum()
df['new'] = df.pop('Weightage_count').div(df.pop('Count'))
print (df)
Col1 Col2 new
0 A S1 0.6250
1 A S2 0.4375
2 B S3 1.0000
3 C S4 0.5000
If need also columns:
df = df.groupby(['Col1','Col2'])[['Count','Weightage_count']].sum()
df['new'] = df['Weightage_count'].div(df['Count'])
print (df)
Count Weightage_count new
Col1 Col2
A S1 4 2.50 0.6250
S2 4 1.75 0.4375
B S3 4 4.00 1.0000
C S4 3 1.50 0.5000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With