I have a DataFrame like this:
Page Line y
1 2 3.2
1 2 6.1
1 3 7.1
2 4 8.5
2 4 9.1
I need to replace each value in column y with the mean of its group. I can do that when grouping by a single column, using this code:
df['y'] = df['y'].groupby(df['Page'], group_keys=False).transform('mean')
Now I am trying to replace the values of y with the mean of the groups defined by both 'Page' and 'Line'. The result should look like this:
Page Line y
1 2 4.65
1 2 4.65
1 3 7.1
2 4 8.8
2 4 8.8
I have searched through a lot of answers on this site but couldn't find one that covers this case. I'm using Python 3 with pandas.
You need to pass a list of column names to the by parameter of groupby:
by : mapping, function, label, or list of labels
Used to determine the groups for the groupby. If by is a function, it’s called on each value of the object’s index. If a dict or Series is passed, the Series or dict VALUES will be used to determine the groups (the Series’ values are first aligned; see .align() method). If an ndarray is passed, the values are used as-is to determine the groups. A label or list of labels may be passed to group by the columns in self. Notice that a tuple is interpreted as a (single) key.
df['y'] = df.groupby(['Page', 'Line'])['y'].transform('mean')
print(df)
Page Line y
0 1 2 4.65
1 1 2 4.65
2 1 3 7.10
3 2 4 8.80
4 2 4 8.80
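For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction of the sample data in the question; only the groupby line comes from the answer above.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (an assumption about
# how the original frame was created).
df = pd.DataFrame({
    "Page": [1, 1, 1, 2, 2],
    "Line": [2, 2, 3, 4, 4],
    "y": [3.2, 6.1, 7.1, 8.5, 9.1],
})

# Group by both columns, then broadcast each group's mean back onto its rows.
# transform keeps the original index, so the result lines up with df.
df["y"] = df.groupby(["Page", "Line"])["y"].transform("mean")

print(df)
```

Because transform returns a result with the same length and index as the input, it can be assigned straight back to the column without a merge.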
Your own approach also works if you pass the grouping Series in a list:
df['y'] = df['y'].groupby([df['Page'], df['Line']]).transform('mean')
So you want this (select y after grouping, so that transform returns a single column):
df['y'] = df.groupby(['Page', 'Line'])['y'].transform('mean')
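One caveat worth sketching: calling transform on the whole grouped frame returns every non-key column, so once the frame has more columns than the three in the question, it is safer to select y first. The extra column z below is hypothetical and only there to show the difference.

```python
import pandas as pd

# Question's sample data plus a hypothetical extra numeric column "z".
df = pd.DataFrame({
    "Page": [1, 1, 1, 2, 2],
    "Line": [2, 2, 3, 4, 4],
    "y": [3.2, 6.1, 7.1, 8.5, 9.1],
    "z": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Without a column selection, transform returns a DataFrame holding
# all non-key columns (here both "y" and "z").
whole = df.groupby(["Page", "Line"]).transform("mean")
print(list(whole.columns))

# Selecting "y" first yields a single Series aligned to df's index,
# which can be assigned to one column unambiguously.
only_y = df.groupby(["Page", "Line"])["y"].transform("mean")
df["y"] = only_y
print(df)
```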