Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Groupby and calculations based on specific row value

I have data that looks like this:

d = {'id' : [1, 1, 1, 2, 2, 2],
     'levels': ['low', 'perfect', 'high', 'low', 'perfect', 'high'],
     'value': [1, 10, 13, 2, 10, 13]}

df = pd.DataFrame(d, columns=['id', 'levels', 'value'])
df = df.groupby(['id','levels'])[['value']].mean()

For each [id, levels], I want to find the difference between the value of the row and the value of the perfect row. It would look like this:

id | levels | value | penalty
1  | high   | 13    | 3
   | low    | 1     | 9
   | perfect| 10    | 0
2  | high   | 13    | 3
   | low    | 2     | 8
   | perfect| 10    | 0

For example, in the first row, you would subtract 13 from the perfect value, which is 10, to get 3.

So how do I make a calculation where I find the perfect value for each [id, levels], and then find the differences?

like image 363
rdk Avatar asked Dec 03 '22 17:12

rdk


2 Answers

Select the cross section of dataframe using xs, then subtract this cross section from the given dataframe on level=0

df['penalty'] =  df['value'].sub(df['value'].xs('perfect', level=1)).abs()

            value  penalty
id levels                 
1  high        13        3
   low          1        9
   perfect     10        0
2  high        13        3
   low          2        8
   perfect     10        0
like image 199
Shubham Sharma Avatar answered Dec 06 '22 05:12

Shubham Sharma


You can try transform and then subtract and convert to absolute:

val = df.loc[df['levels'].eq('perfect').groupby(df['id']).transform('idxmax'),'value']
df['penalty'] = df['value'].sub(val.to_numpy()).abs()

print(df)

   id   levels  value  penalty
0   1      low      1        9
1   1  perfect     10        0
2   1     high     13        3
3   2      low      2        8
4   2  perfect     10        0
5   2     high     13        3
like image 33
anky Avatar answered Dec 06 '22 07:12

anky