Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count number of categories after GROUP BY in SQL or Pandas

I have a dataframe df :

ORDERID    PRODUCTTYPE  PRODUCTID  PRODUCT
123         Fruits         2       Banana
123         Vegetables     3       Tomato
123         Vegetables     3       Onion
321         Fruits         2       Grapes
321         Fruits         2       Avocado

I need output as

ORDERID  FRUITS  VEGETABLES
123       1          2
321       2          0

Is there any modification i need to perform to the groupby ? I'm doing

df.groupby('ORDERID'['PRODUCTTYPE'].nunique().reset_index(name="count")

but thats just counting the number of categories

like image 304
gforce91 Avatar asked Jan 26 '23 06:01

gforce91


1 Answers

A combination of groupby().value_counts() and unstack():

(df.groupby('ORDERID')['PRODUCTTYPE']
   .value_counts()
   .unstack('PRODUCTTYPE', fill_value=0)
)

Or pivot_table:

df.pivot_table(index='ORDERID', 
               columns='PRODUCTTYPE', 
               values='PRODUCTID', 
               aggfunc='count',
               fill_value=0)

Output:

PRODUCTTYPE  Fruits  Vegetables
ORDERID                        
123               1           2
321               2           0
like image 164
Quang Hoang Avatar answered Feb 03 '23 06:02

Quang Hoang