
Having trouble with a multi-key "groupby" on a variable and a category (binned data)

df.dtypes

Close       float64
eqId          int64
date         object
IntDate       int64
expiry        int64
delta         int64
ivMid       float64
conf        float64
Skew        float64
psc         float64
vol_B      category
dtype: object

gb = df.groupby([df['vol_B'],df['expiry']])

gb.describe()

I get a long error message with the final line being

AttributeError: 'Categorical' object has no attribute 'flags'

When I group by each of them separately, each groupby works fine on its own. I just cannot group by both at once when one of the keys is a "bin" (a categorical column).

Also, when I use two other, non-categorical variables, the multi-key groupby works. I successfully performed this:

gb = df.groupby([df['delta'],df['expiry']])
asked Nov 22 '25 05:11 by John


1 Answer

I ran into a similar issue and found this question while looking for solutions. After going through the pandas documentation on categorical data, a simple hack that worked for me was to convert the categorical column to another type before grouping.

Since vol_B is the categorical variable in your case, you could try the following:

# If the categories of vol_B are numeric, astype(int) or astype(float) should work as well.
gb = df.groupby([df['vol_B'].astype(str), df['expiry']])

I haven't looked into why this works while grouping on the categorical column directly does not. If I do, I will update the answer.
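For anyone who wants to try the workaround on a small frame first, here is a minimal sketch. The column names follow the question, but the values, the bin edges and the labels ("low", "mid", "high") are made up for illustration, and I am assuming vol_B was produced by something like pd.cut. It does not try to reproduce the AttributeError itself, which may depend on the pandas version.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
df = pd.DataFrame({
    "ivMid": [0.10, 0.15, 0.22, 0.28, 0.35, 0.41],
    "expiry": [30, 30, 60, 60, 30, 60],
    "Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})

# Bin ivMid into a categorical column (assumed to be how vol_B was built).
df["vol_B"] = pd.cut(
    df["ivMid"],
    bins=[0.0, 0.2, 0.3, 0.5],
    labels=["low", "mid", "high"],
)

# Workaround from the answer: group by the string form of the categorical.
gb = df.groupby([df["vol_B"].astype(str), df["expiry"]])

# Summary statistics of Close for each (vol_B, expiry) pair that occurs.
summary = gb["Close"].describe()
print(summary)
```

One side effect to keep in mind: after astype(str) the bins are plain strings, so the groups sort alphabetically ("high", "low", "mid") instead of in bin order.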

answered Nov 24 '25 22:11 by krypto07


