I would like to add a rescaled column to this dataframe:
I,Value
A,1
A,4
A,2
A,5
B,1
B,2
B,1
so that the new column (let's call it scale) is the result of applying a function to the Value column within each group of I. The function is just a min-max normalization over the range of each group:
lambda x: (x-min(x))/(max(x)-min(x))
So far I tried:
d = df.groupby('I').apply(lambda x: (x-min(x))/(max(x)-min(x)))
but this raises the following TypeError:
TypeError: Could not operate array(['A'], dtype=object) with block values index 1 is out of bounds for axis 1 with size 1
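If you select the 'Value' column before calling apply, it works. Without that selection, the lambda receives each group's whole DataFrame, including the non-numeric I column: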
In [69]:
df.groupby('I')['Value'].apply(lambda x: (x-min(x))/(max(x)-min(x)))
Out[69]:
0    0.00
1    0.75
2    0.25
3    1.00
4    0.00
5    1.00
6    0.00
dtype: float64
The version below uses the pandas min and max methods instead of the Python built-ins and produces the same result, assigned to a new column:
In [67]:
df['Normalised'] = df.groupby('I')['Value'].apply(lambda x: (x-x.min())/(x.max()-x.min()))
df
Out[67]:
   I  Value  Normalised
0  A      1        0.00
1  A      4        0.75
2  A      2        0.25
3  A      5        1.00
4  B      1        0.00
5  B      2        1.00
6  B      1        0.00
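As an alternative, here is a minimal sketch that uses transform instead of apply. It is not part of the original answer. transform returns a result aligned with the original index, which matters because, depending on the pandas version, apply on a groupby may add the group key to the result's index, and the column assignment then no longer lines up. The helper name min_max is made up for illustration, and the column is named scale as in the question:

```python
import pandas as pd

# Rebuild the example dataframe from the question
df = pd.DataFrame({
    "I": ["A", "A", "A", "A", "B", "B", "B"],
    "Value": [1, 4, 2, 5, 1, 2, 1],
})

def min_max(s):
    # Hypothetical helper: rescale one group's values to the range [0, 1].
    # Assumes the group has at least two distinct values; if max == min
    # the division yields NaN.
    return (s - s.min()) / (s.max() - s.min())

# transform applies min_max per group and keeps the original row index,
# so the result can be assigned straight back as a new column
df["scale"] = df.groupby("I")["Value"].transform(min_max)
print(df)
```

For this data the scale column should match the Normalised column shown above.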