Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sort value per group in already grouped dataframe

There are many answers to sorting per group when performing groupby operation however in my case I groupby then use aggregated columns to create another column. I want to sort by newly created column per group.

MRE:

df = pd.DataFrame({"A":[1,1,1,3,3,3, 1,1,1,3,3,3],
                   "B":["a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"],
                   "click":[100, 200, 123, 333, 222, 333, 100, 200, 123, 333, 222, 333],
                   "exp":[10000, 10000, 10000, 10000, 10000, 10000, 20000, 20000, 20000, 20000, 20000, 20000]})

grp_df = df.groupby(["A", "B"]).sum()
grp_df["ctr"] = grp_df["click"] / grp_df["exp"] * 100

outputs:

        click   exp     ctr
A   B           
1   a   200     30000   0.666667
    b   400     30000   1.333333
    c   246     30000   0.820000
3   a   666     30000   2.220000
    b   444     30000   1.480000
    c   666     30000   2.220000

Desired output:

        click   exp     ctr
A   B           
1   b   400     30000   1.333333
    c   246     30000   0.820000
    a   200     30000   0.666667
3   a   666     30000   2.220000
    c   666     30000   2.220000
    b   444     30000   1.480000
like image 336
haneulkim Avatar asked Jan 31 '26 07:01

haneulkim


2 Answers

You can groupby A (level=0) and then sort_values by ctr column:

grp_df.groupby(level=0).apply(
  lambda g: g.sort_values('ctr', ascending=False)
).reset_index(level=0, drop=True)

     click    exp       ctr
A B                        
1 b    400  30000  1.333333
  c    246  30000  0.820000
  a    200  30000  0.666667
3 a    666  30000  2.220000
  c    666  30000  2.220000
  b    444  30000  1.480000

Or as @haneulkim commented, a more concise option would be with group_keys=False:

grp_df.groupby(level=0, group_keys=False).apply(
  lambda g: g.sort_values('ctr', ascending=False))
like image 136
Psidom Avatar answered Feb 03 '26 08:02

Psidom


You could use sort_values, and pair the grouper with the new column:

 grp_df.sort_values(['A', 'ctr'], ascending = [True, False])

     click    exp       ctr
A B
1 b    400  30000  1.333333
  c    246  30000  0.820000
  a    200  30000  0.666667
3 a    666  30000  2.220000
  c    666  30000  2.220000
  b    444  30000  1.480000
like image 20
sammywemmy Avatar answered Feb 03 '26 09:02

sammywemmy



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!