I'm trying to pass a user-defined function pct to the Pandas agg method. It works if I pass only that function, but it fails when I use the dictionary format for defining the functions. Does anyone know why?
import pandas as pd
df = pd.DataFrame([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]],
                  columns=['A', 'B', 'C'])
pct = lambda x: len(x)/len(df)
df.groupby('A').agg(pct)
returns as expected
          B         C
A
1  0.333333  0.333333
4  0.333333  0.333333
7  0.333333  0.333333
But
aggs = {'B':['pct']}
df.groupby('A').agg(aggs)
returns the following error:
AttributeError: 'SeriesGroupBy' object has no attribute 'pct'
Your function is pct, not 'pct'. The quoted version is a string, and .agg() treats a string as the name of a method on the grouped object. SeriesGroupBy has no method called pct, hence the AttributeError.
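To illustrate the lookup, here is a minimal sketch using the DataFrame from the question. It assumes string names passed to agg are resolved against the grouped object's own methods, which is consistent with the error message above. The variable names grouped_b, has_count, has_pct and counts are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]],
                  columns=['A', 'B', 'C'])

# The grouped column that agg works on for key 'B'
grouped_b = df.groupby('A')['B']

# 'count' is a built-in method of the grouped object, 'pct' is not
has_count = hasattr(grouped_b, 'count')
has_pct = hasattr(grouped_b, 'pct')
print(has_count, has_pct)

# A string naming a built-in method therefore works in the dict format
counts = df.groupby('A').agg({'B': ['count']})
print(counts)
```

So strings are fine for built-in aggregations such as 'count' or 'mean', while a custom function has to be passed as the function object itself.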
You passed the string 'pct', but you need the variable pct (the lambda function). Remove the quotes:
aggs = {'B':pct}
print(df.groupby('A').agg(aggs))
          B
A
1  0.333333
4  0.333333
7  0.333333
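If you want to keep the list form from the question, or get a readable column label, one possible approach is sketched below. It swaps the lambda for a regular function named pct (a change from the original code, so that the column is labelled pct rather than <lambda>), and also shows named aggregation, where the output name B_pct is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]],
                  columns=['A', 'B', 'C'])

def pct(x):
    # Share of all rows in df that fall into this group
    return len(x) / len(df)

# List form: pass the function object, not its name as a string
by_list = df.groupby('A').agg({'B': [pct]})
print(by_list)

# Named aggregation: output column name = (input column, function)
by_name = df.groupby('A').agg(B_pct=('B', pct))
print(by_name)
```

The list form gives a two-level column index (B, pct), while named aggregation gives a single flat column, which is usually easier to work with afterwards.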