Replacing the first element of each group by its aggregation function

Question

Suppose the following dataframe:

import pandas as pd

df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

which looks like this:

    X   Y
0   a   2
1   a   4
2   b   8
3   a   10
4   b   5

How can I replace the first element of each group (grouping by X) with that group's mean of Y?

The expected output:

    X   Y
0   a   5.33
1   a   4.00
2   b   6.50
3   a   10.00
4   b   5.00

Sorry if this is too basic a question; I am new to Python and only just starting to learn it.

jezrael · Accepted Answer

Use GroupBy.transform to compute the per-group means, then use numpy.where with a mask from Series.duplicated so that only the first value of each group is replaced:

import numpy as np

df['Y'] = np.where(df.X.duplicated(), df.Y, df.groupby("X")['Y'].transform('mean'))
print(df)
   X          Y
0  a   5.333333
1  a   4.000000
2  b   6.500000
3  a  10.000000
4  b   5.000000
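
To see why this works, it can help to look at the two intermediate pieces on their own. The following is only a sketch on the question's data; the names is_repeat, group_mean and result are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

# True for every row whose X value has already appeared earlier,
# so it is False only on the first row of each group
is_repeat = df['X'].duplicated()

# Per-group mean of Y, broadcast back to every row of the frame
group_mean = df.groupby('X')['Y'].transform('mean')

# Keep Y where the row is a repeat, otherwise take the group mean
result = np.where(is_repeat, df['Y'], group_mean)

print(is_repeat.tolist())
print(group_mean.round(2).tolist())
print(result.round(2).tolist())
```

Because duplicated() marks everything except the first occurrence, the original Y goes in the "condition is True" slot of numpy.where and the mean goes in the "False" slot.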

Another solution uses DataFrame.loc with the inverted mask, so that only the first row of each group is assigned:

df.loc[~df.X.duplicated(), 'Y'] = df.groupby("X")['Y'].transform('mean')
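
One caveat, offered as an assumption about recent pandas releases rather than something the answer states: Y starts out as an integer column, and newer versions may warn or raise when float means are assigned into it through .loc. Casting the column to float first should sidestep that. A self-contained sketch, where the name first is illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {'X': ['a', 'a', 'b', 'a', 'b'],
     'Y': [2, 4, 8, 10, 5]})

# Cast up front so the float means fit the column dtype
df['Y'] = df['Y'].astype(float)

# ~duplicated() is True only on the first row of each group
first = ~df['X'].duplicated()

# The right-hand Series is aligned on the index, so only the
# selected rows receive their group mean
df.loc[first, 'Y'] = df.groupby('X')['Y'].transform('mean')
print(df)
```

The assignment relies on index alignment: the transform result has one value per original row, and .loc picks out just the rows selected by the mask.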

Tags: python, pandas

PaulS
