
Python Pandas: Passing arguments to a function in agg()

I am trying to reduce data in a pandas DataFrame using different kinds of functions and argument values. However, I have not managed to change the default arguments of the aggregation functions. Here is an example:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
...                    'y': ['a','a','b','b']})
>>> df
     x  y
0  1.0  a
1  NaN  a
2  2.0  b
3  1.0  b

Here is an aggregation function, for which I would like to test different values of b:

>>> def translate_mean(x, b=10):
...   y = [elem + b for elem in x]
...   return np.mean(y)

In the following code, I can use this function with the default b value, but I would like to pass other values:

>>> df.groupby('y').agg(translate_mean)
      x
y
a   NaN
b  11.5

Any ideas?

Tanguy asked Jun 15 '17 21:06


3 Answers

Just pass them as keyword arguments to agg (this works with apply, too).

df.groupby('y').agg(translate_mean, b=4)
Out: 
     x
y     
a  NaN
b  5.5
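
As a minimal sketch of why this works: agg forwards any extra positional and keyword arguments to the function it calls for each group. The example below selects column x explicitly (my choice, not part of the answer above) and also shows functools.partial as an alternative way to bind b up front.

```python
from functools import partial

import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

def translate_mean(x, b=10):
    # Shift every element by b, then average; NaN propagates through np.mean.
    return np.mean([elem + b for elem in x])

# Extra keyword arguments given to agg are forwarded to translate_mean.
via_kwargs = df.groupby('y')['x'].agg(translate_mean, b=4)

# Alternative: bind b beforehand so agg receives a one-argument callable.
via_partial = df.groupby('y')['x'].agg(partial(translate_mean, b=4))

print(via_kwargs)
print(via_partial)
```

Both calls should give the same per-group values; partial can be handy when the callable has to be handed to code that does not forward keyword arguments.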
ayhan answered Nov 11 '22 19:11


If you have multiple columns and want to apply different functions or different parameters to each column, you can pass lambda functions to agg. For example:

>>> df = pd.DataFrame({'x': [1,np.nan,2,1],
...                    'y': ['a','a','b','b'],
...                    'z': [0.1,0.2,0.3,0.4]})
>>> df
     x  y    z
0  1.0  a  0.1
1  NaN  a  0.2
2  2.0  b  0.3
3  1.0  b  0.4

>>> def translate_mean(x, b=10):
...   y = [elem + b for elem in x]
...   return np.mean(y)

To group by column 'y' and apply translate_mean with b=10 to column 'x' and b=25 to column 'z', you can try this:

df_res = df.groupby(by='y').agg({
    'x': lambda x: translate_mean(x, 10),
    'z': lambda x: translate_mean(x, 25)})

Hopefully, it helps.
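
A related sketch, in case explicit output column names are useful: the same per-column parameters can be written with named aggregation, where each keyword maps to a (column, function) pair. The names x_shifted and z_shifted are made up for illustration, and z is assumed to be numeric so that adding b works.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b'],
                   'z': [0.1, 0.2, 0.3, 0.4]})

def translate_mean(x, b=10):
    # Shift every element by b, then average the shifted values.
    return np.mean([elem + b for elem in x])

# Each keyword becomes an output column; the tuple is (source column, function).
# x_shifted and z_shifted are hypothetical names chosen for this example.
df_named = df.groupby('y').agg(
    x_shifted=('x', lambda s: translate_mean(s, b=10)),
    z_shifted=('z', lambda s: translate_mean(s, b=25)),
)

print(df_named)
```

Compared with the dict form, this keeps the result columns distinct from the input names, which helps if the same column is later aggregated with several values of b.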

Yunzhao Xing answered Nov 11 '22 19:11


Maybe you can try using apply in this case:

df.groupby('y').apply(lambda x: translate_mean(x['x'], 20))

Now the result is:

y
a     NaN
b    21.5
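
A variant worth sketching: apply on a selected column also forwards extra keyword arguments, so the lambda is optional. Selecting x first is my own choice here; some recent pandas releases warn when apply on the whole grouped frame touches the grouping columns, and picking the column up front avoids that.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [1, np.nan, 2, 1],
                   'y': ['a', 'a', 'b', 'b']})

def translate_mean(x, b=10):
    # Shift every element by b, then average the shifted values.
    return np.mean([elem + b for elem in x])

# Each group's x values are passed as a Series; b=20 is forwarded by apply.
result = df.groupby('y')['x'].apply(translate_mean, b=20)

print(result)
```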
Bubble Bubble Bubble Gut answered Nov 11 '22 17:11