 

Python pandas groupby aggregation

I have a DataFrame df with columns (age, height). I want to see how the mean height changes with age, so I group df by age and try to build a new DataFrame new_df with columns (age, mean_height). My code is below:

groups = df.groupby('age')
new_df = groups.agg({'height': np.mean,
                     # 'age': ...  HOW to add age?
                     })

But I don't know how to include age in new_df. Any advice would be appreciated.

Alcott asked Apr 09 '26 09:04



1 Answer

Age is the index of the aggregated dataframe:

In [95]: df = DataFrame({'age':[10,10,20,20,20], 'height':[140,150,145, 190,200]})

In [96]: df
Out[96]: 
   age  height
0   10     140
1   10     150
2   20     145
3   20     190
4   20     200

In [97]: groups = df.groupby('age')

In [98]: groups.agg({'height':np.mean})
Out[98]: 
         height
age            
10   145.000000
20   178.333333

df.groupby('age').mean() would achieve the same result. If you want age as a regular column rather than the index, add a call to reset_index().
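
As a minimal sketch of the reset_index() route, assuming the same sample data as above: the column name mean_height follows the question, and the rename step is an illustrative addition rather than part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'age': [10, 10, 20, 20, 20],
                   'height': [140, 150, 145, 190, 200]})

# Group by age and average the heights; 'age' is the index at this point.
mean_height = df.groupby('age')['height'].mean()

# reset_index() moves 'age' back into a regular column;
# the rename (an assumption, for illustration) matches the question's naming.
new_df = mean_height.reset_index().rename(columns={'height': 'mean_height'})
print(new_df)
```

Selecting the 'height' column before calling mean() keeps the aggregation limited to that column, which may matter if df has other columns.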

As an alternative, you can call the groupby with as_index=False:

groups = df.groupby('age', as_index=False)
groups.agg({'height': np.mean})
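
A self-contained sketch of the as_index=False variant, assuming the same sample data as above. It passes the string 'mean' instead of np.mean, a substitution made here so the example needs no NumPy import; pandas accepts either form.

```python
import pandas as pd

df = pd.DataFrame({'age': [10, 10, 20, 20, 20],
                   'height': [140, 150, 145, 190, 200]})

# as_index=False keeps the grouping key 'age' as a regular column,
# so no reset_index() call is needed afterwards.
result = df.groupby('age', as_index=False).agg({'height': 'mean'})
print(result)
```

The result here has a default integer index with age and height as ordinary columns.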
Korem answered Apr 12 '26 00:04



