I am trying to reduce data in a pandas dataframe by using different kinds of functions and argument values. However, I have not managed to change the default arguments of the aggregation functions. Here is an example:
>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
... 'y': ['a','a','b','b']})
>>> df
x y
0 1.0 a
1 NaN a
2 2.0 b
3 1.0 b
Here is an aggregation function, for which I would like to test different values of b:
>>> def translate_mean(x, b=10):
... y = [elem + b for elem in x]
... return np.mean(y)
In the following code, I can use this function with the default b value, but I would like to pass other values:
>>> df.groupby('y').agg(translate_mean)
x
y
a NaN
b 11.5
Any ideas?
Just pass the extra arguments to agg (this works with apply, too).
df.groupby('y').agg(translate_mean, b=4)
Out:
x
y
a NaN
b 5.5
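Since the question is about testing several values of b, here is a minimal sketch that loops over a few of them and collects the results side by side. The values (0, 4, 10) and the names results and summary are only illustrative, and selecting the 'x' column first is my own choice so that each result is a Series.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

def translate_mean(x, b=10):
    # Shift every element by b, then average (NaN propagates through np.mean).
    y = [elem + b for elem in x]
    return np.mean(y)

# Keyword arguments given to agg are forwarded to translate_mean.
# The candidate values of b below are arbitrary examples.
results = {b: df.groupby('y')['x'].agg(translate_mean, b=b) for b in (0, 4, 10)}

# One column per tested value of b, one row per group.
summary = pd.DataFrame(results)
print(summary)
```

Group a stays NaN for every b because translate_mean uses np.mean, which does not skip missing values.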
In case you have multiple columns and want to apply different functions and different parameters to each column, you can use lambda functions with agg. For example:
>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
... 'y': ['a','a','b','b'],
... 'z': [0.1,0.2,0.3,0.4]})
>>> df
x y z
0 1.0 a 0.1
1 NaN a 0.2
2 2.0 b 0.3
3 1.0 b 0.4
>>> def translate_mean(x, b=10):
... y = [elem + b for elem in x]
... return np.mean(y)
To group by column 'y' and apply translate_mean with b=10 to column 'x' and b=25 to column 'z', you can try this:
df_res = df.groupby(by='y').agg({
'x': lambda x: translate_mean(x, 10),
'z': lambda x: translate_mean(x, 25)})
Hopefully, it helps.
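For reference, a self-contained variant of the per-column idea above, written with named aggregation so the output columns get explicit names. This is only a sketch: it assumes 'z' holds numbers rather than strings (adding b to a string would fail), and the output names x_shifted and z_shifted are made up for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: 'z' is numeric so that elem + b is valid.
df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b'],
                   'z': [0.1, 0.2, 0.3, 0.4]})

def translate_mean(x, b=10):
    # Shift every element by b, then average.
    y = [elem + b for elem in x]
    return np.mean(y)

# Named aggregation: output_name=(input_column, function).
# Each lambda fixes a different value of b for its column.
df_res = df.groupby('y').agg(
    x_shifted=('x', lambda s: translate_mean(s, b=10)),
    z_shifted=('z', lambda s: translate_mean(s, b=25)),
)
print(df_res)
```

The dict form shown above gives the same numbers; named aggregation just makes it easier to tell the result columns apart when several parameter values are in play.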
Maybe you can try using apply in this case:
df.groupby('y').apply(lambda x: translate_mean(x['x'], 20))
Now the result is:
y
a NaN
b 21.5
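As a possible alternative to the lambda, you can select the column first and let apply forward the keyword argument itself. A small sketch, assuming the same df and translate_mean as in the question (the name result is just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

def translate_mean(x, b=10):
    # Shift every element by b, then average.
    y = [elem + b for elem in x]
    return np.mean(y)

# Selecting 'x' gives a SeriesGroupBy; apply passes b=20 on to translate_mean.
result = df.groupby('y')['x'].apply(translate_mean, b=20)
print(result)
```

This should give the same values as the lambda version, with the function only ever seeing the 'x' values of each group.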