I tried to calculate specific quantile values from a data frame, as shown in the code below. There was no problem when calculate it in separate lines.
When attempting to run last 2 lines, I get the following error:
AttributeError: 'SeriesGroupBy' object has no attribute 'quantile(0.25)'
How can I fix this?
import pandas as pd
df = pd.DataFrame(
{
'x': [0, 1, 0, 1, 0, 1, 0, 1],
'y': [7, 6, 5, 4, 3, 2, 1, 0],
'number': [25000, 35000, 45000, 50000, 60000, 70000, 65000, 36000]
}
)
f = {'number': ['median', 'std', 'quantile']}
df1 = df.groupby('x').agg(f)
df.groupby('x').quantile(0.25)
df.groupby('x').quantile(0.75)
# code below with problem:
f = {'number': ['median', 'std', 'quantile(0.25)', 'quantile(0.75)']}
df1 = df.groupby('x').agg(f)
I prefer def functions
def q1(x):
return x.quantile(0.25)
def q3(x):
return x.quantile(0.75)
f = {'number': ['median', 'std', q1, q3]}
df1 = df.groupby('x').agg(f)
df1
Out[1643]:
number
median std q1 q3
x
0 52500 17969.882211 40000 61250
1 43000 16337.584481 35750 55000
@WeNYoBen's answer is great. There is one limitation though, and that lies with the fact that one needs to create a new function for every quantile. This can be a very unpythonic exercise if the number of quantiles become large. A better approach is to use a function to create a function, and to rename that function appropriately.
def rename(newname):
def decorator(f):
f.__name__ = newname
return f
return decorator
def q_at(y):
@rename(f'q{y:0.2f}')
def q(x):
return x.quantile(y)
return q
f = {'number': ['median', 'std', q_at(0.25) ,q_at(0.75)]}
df1 = df.groupby('x').agg(f)
df1
Out[]:
number
median std q0.25 q0.75
x
0 52500 17969.882211 40000 61250
1 43000 16337.584481 35750 55000
The rename decorator renames the function so that the pandas agg function can deal with the reuse of the quantile function returned (otherwise all quantiles results end up in columns that are named q).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With