
Replacing the first element of each group with the group's aggregate value

Tags: python, pandas

Suppose we have the following DataFrame:

import pandas as pd

df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

which looks like this:

    X   Y
0   a   2
1   a   4
2   b   8
3   a   10
4   b   5

How can I replace the first element of each group (grouped by X) with that group's mean of Y?

The expected output:

    X   Y
0   a   5.33
1   a   4.00
2   b   6.50
3   a   10.00
4   b   5.00

Sorry if this question is too basic; I am new to Python and just starting to learn it.

PaulS asked Mar 16 '26 03:03


1 Answer

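Use GroupBy.transform to compute the per-group mean for every row, then use numpy.where to set it only on the first row of each group, with Series.duplicated as the mask:

import numpy as np

# duplicated() is False on the first occurrence of each X, so those rows take the group mean
df['Y'] = np.where(df.X.duplicated(), df.Y, df.groupby("X")['Y'].transform('mean'))
print(df)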
   X          Y
0  a   5.333333
1  a   4.000000
2  b   6.500000
3  a  10.000000
4  b   5.000000
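
To make the one-liner easier to follow, here is a minimal sketch that splits it into its intermediate pieces. The names group_mean, is_repeat and result are chosen here for illustration and are not part of the original answer; the sample data is the frame from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'X': ['a', 'a', 'b', 'a', 'b'],
                   'Y': [2, 4, 8, 10, 5]})

# Every row receives the mean of Y for its own X group,
# so the result has the same length and index as df.
group_mean = df.groupby('X')['Y'].transform('mean')

# False on the first occurrence of each X value, True on later repeats.
is_repeat = df['X'].duplicated()

# Keep the original Y on repeats, take the group mean on first occurrences.
# assign() returns a new frame and leaves df untouched.
result = df.assign(Y=np.where(is_repeat, df['Y'], group_mean))
print(result)
```

Because transform returns one value per row rather than one per group, the mean lines up with the mask row by row and no merge back onto the frame is needed.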
    

Another solution uses DataFrame.loc with the inverted mask, so only the first row of each group is assigned:

df.loc[~df.X.duplicated(), 'Y'] = df.groupby("X")['Y'].transform('mean')
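
One caveat with the loc variant, offered as a hedged sketch rather than part of the original answer: Y starts out as an integer column, and recent pandas versions may warn or raise when float values are written into it in place. Casting the column to float first is one way around that, assuming the upcast is acceptable for your data. The name first is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'X': ['a', 'a', 'b', 'a', 'b'],
                   'Y': [2, 4, 8, 10, 5]})

# Cast up front so the float means match the column's dtype.
df['Y'] = df['Y'].astype(float)

# True only on the first occurrence of each X value.
first = ~df['X'].duplicated()

# The right-hand side is aligned on the index, so each selected row
# receives the mean of its own group.
df.loc[first, 'Y'] = df.groupby('X')['Y'].transform('mean')
print(df)
```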
jezrael answered Mar 17 '26 15:03


