
Pandas dataframe groupby and sort

I have a dataframe with 4 columns: the first two contain strings (categorical variables) and the last two contain numbers.

Type    Subtype    Price    Quantity
Car     Toyota     10       1
Car     Ford       50       2
Fruit   Banana     50       20
Fruit   Apple      20       5 
Fruit   Kiwi       30       50
Veggie  Pepper     10       20
Veggie  Mushroom   20       10
Veggie  Onion      20       3
Veggie  Beans      10       10  
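
For anyone who wants to reproduce the problem, the table above can be rebuilt as a dataframe roughly like this (a minimal sketch; the variable name df matches the code later in the post, and the dtypes are assumed to be plain strings and integers):

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Type": ["Car", "Car", "Fruit", "Fruit", "Fruit",
             "Veggie", "Veggie", "Veggie", "Veggie"],
    "Subtype": ["Toyota", "Ford", "Banana", "Apple", "Kiwi",
                "Pepper", "Mushroom", "Onion", "Beans"],
    "Price": [10, 50, 50, 20, 30, 10, 20, 20, 10],
    "Quantity": [1, 2, 20, 5, 50, 20, 10, 3, 10],
})

# Total Price per Type, which is the quantity the groups should be ordered by.
print(df.groupby("Type")["Price"].sum())
```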

How do I sort the dataframe so that the Type groups appear in descending order of their total Price, and the Subtype rows within each Type are also sorted by Price in descending order? Like this:

Type    Subtype    Price    Quantity
Fruit   Banana     50       20
        Kiwi       30       50
        Apple      20       5 
Car     Ford       50       2
        Toyota     10       1
Veggie  Mushroom   20       10
        Onion      20       3
        Beans      10       10  
        Pepper     10       20

I tried the following but it did not sort the Subtype column in descending order:

df = df.groupby(['Type','Subtype'])['Price', 'Quantity'].agg({'Price':sum})
i = df.index.get_level_values(0)
df = df.iloc[i.reindex
                   (df['PRICE'].groupby(level=0, 
                   group_keys=False).sum().sort_values('PRICE', ascending=False).index)[1]]
df.columns = df.columns.get_level_values(1)

Edit: There are multiple items under Subtype that are the same so I would like both Type and Subtype columns grouped as well.

asked Feb 11 '19 14:02 by user112947




1 Answer

Try:

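# sortkey holds each Type's total Price, repeated on every row of that Type.
# Sort by it (descending), then by Type to keep tied groups together,
# then by Price (descending) within each Type; finally drop the helper column.
(df.assign(sortkey=df.groupby('Type')['Price'].transform('sum'))
   .sort_values(['sortkey', 'Type', 'Price'], ascending=[False, True, False])
   .set_index(['Type', 'Subtype'])
   .drop('sortkey', axis=1))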

Output:

                 Price  Quantity
Type   Subtype                  
Fruit  Banana       50        20
       Kiwi         30        50
       Apple        20         5
Car    Ford         50         2
       Toyota       10         1
Veggie Mushroom     20        10
       Onion        20         3
       Pepper       10        20
       Beans        10        10

answered Oct 13 '22 14:10 by Scott Boston
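
The Edit in the question mentions repeated Subtype entries, which the answer above does not cover. Here is a minimal sketch of one way to handle them. It assumes duplicate (Type, Subtype) pairs should be combined by summing Price and Quantity, since the question does not say how to combine them. The extra (Fruit, Apple) row is hypothetical and is added only to show the aggregation step:

```python
import pandas as pd

# Sample data from the question, plus one hypothetical duplicate
# (Fruit, Apple) row in the last position, added only for illustration.
df = pd.DataFrame({
    "Type": ["Car", "Car", "Fruit", "Fruit", "Fruit",
             "Veggie", "Veggie", "Veggie", "Veggie", "Fruit"],
    "Subtype": ["Toyota", "Ford", "Banana", "Apple", "Kiwi",
                "Pepper", "Mushroom", "Onion", "Beans", "Apple"],
    "Price": [10, 50, 50, 20, 30, 10, 20, 20, 10, 5],
    "Quantity": [1, 2, 20, 5, 50, 20, 10, 3, 10, 1],
})

# Collapse duplicate (Type, Subtype) pairs by summing the numeric columns.
agg = df.groupby(["Type", "Subtype"], as_index=False)[["Price", "Quantity"]].sum()

# Same sort-key idea as the answer, applied to the aggregated frame.
result = (
    agg.assign(sortkey=agg.groupby("Type")["Price"].transform("sum"))
       .sort_values(["sortkey", "Type", "Price"], ascending=[False, True, False])
       .set_index(["Type", "Subtype"])
       .drop(columns="sortkey")
)
print(result)
```

If the duplicates should be combined differently (for example, taking the mean Price), only the aggregation line needs to change; the sorting step stays the same.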