
Use Pandas groupby() + apply() with arguments

I would like to use df.groupby() in combination with apply() to apply a function to each row per group.

I normally use the following code, which works (note that this is without groupby()):

df.apply(myFunction, args=(arg1,)) 

With the groupby() I tried the following:

df.groupby('columnName').apply(myFunction, args=(arg1,)) 

However, I get the following error:

TypeError: myFunction() got an unexpected keyword argument 'args'

Hence, my question is: How can I use groupby() and apply() with a function that needs arguments?

asked Apr 18 '17 at 22:04 by beta




1 Answer

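pandas.core.groupby.GroupBy.apply does NOT have a named parameter args, whereas pandas.DataFrame.apply does. GroupBy.apply has the signature apply(func, *args, **kwargs) and forwards any extra positional and keyword arguments straight to func, so args=(arg1,) reaches myFunction as an unexpected keyword argument named args.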
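To see why the error appears, here is a minimal sketch that reproduces it. The frame, the column names key and val, and the function scale_max are hypothetical stand-ins for the asker's data and myFunction; the value column is selected explicitly so the behaviour should not depend on how a given pandas version treats the grouping column.

```python
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
df = pd.DataFrame({"key": ["x", "x", "y"], "val": [1, 2, 3]})

def scale_max(group, n):
    # Hypothetical stand-in for myFunction: needs one extra argument, n.
    return group["val"].max() * n

try:
    # GroupBy.apply forwards args=(10,) to scale_max as a keyword argument,
    # so scale_max is effectively called as scale_max(group, args=(10,)).
    df.groupby("key")[["val"]].apply(scale_max, args=(10,))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The message names args as the unexpected keyword, which matches the error in the question.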

So try this:

df.groupby('columnName').apply(lambda x: myFunction(x, arg1)) 

or, as suggested by @Zero, pass the argument positionally (note that ('arg1') would just be the string 'arg1', not a tuple):

df.groupby('columnName').apply(myFunction, arg1)

Demo:

In [82]: df = pd.DataFrame(np.random.randint(5,size=(5,3)), columns=list('abc'))

In [83]: df
Out[83]:
   a  b  c
0  0  3  1
1  0  3  4
2  3  0  4
3  4  2  3
4  3  4  1

In [84]: def f(ser, n):
    ...:     return ser.max() * n
    ...:

In [85]: df.apply(f, args=(10,))
Out[85]:
a    40
b    40
c    40
dtype: int64

When using GroupBy.apply, you can pass either named arguments:

In [86]: df.groupby('a').apply(f, n=10)
Out[86]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30

or positional arguments (a bare value, since (10) is simply the integer 10 and not a tuple):

In [87]: df.groupby('a').apply(f, 10)
Out[87]:
    a   b   c
a
0   0  30  40
3  30  40  40
4  40  20  30
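
For a reproducible version of the demo, the sketch below hard-codes the same five rows instead of drawing them randomly and compares the ways of passing the extra argument. Two things are my own assumptions rather than part of the answer: functools.partial is shown as one more alternative, and the value columns b and c are selected explicitly so that the grouping column stays out of the function input whatever the pandas version.

```python
from functools import partial

import pandas as pd

# Same values as the random frame printed in the demo, fixed for reproducibility.
df = pd.DataFrame({"a": [0, 0, 3, 4, 3],
                   "b": [3, 3, 0, 2, 4],
                   "c": [1, 4, 4, 3, 1]})

def f(group, n):
    # Column-wise maximum of the group, scaled by n.
    return group.max() * n

# Select the value columns explicitly (an assumption, see the note above).
grouped = df.groupby("a")[["b", "c"]]

via_lambda = grouped.apply(lambda g: f(g, 10))   # wrap the call in a lambda
via_positional = grouped.apply(f, 10)            # forwarded through *args
via_keyword = grouped.apply(f, n=10)             # forwarded through **kwargs
via_partial = grouped.apply(partial(f, n=10))    # bind n before calling apply

print(via_keyword)
```

All four calls should give the same frame, with the b and c values matching the corresponding columns in the demo output.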
answered Sep 22 '22 at 07:09 by MaxU - stop WAR against UA