I have a dataframe like this:
    cluster  org      time
       1      a       8
       1      a       6
       2      h       34
       1      c       23
       2      d       74
       3      w       6

I would like to calculate the average of time per org per cluster.
Expected result:
    cluster mean(time)
    1       15          ((8+6)/2+23)/2
    2       54          (74+34)/2
    3       6

I do not know how to do it in Pandas, can anybody help?
Pandas Groupby Mean

To get the average (or mean) value in each group, you can directly apply the pandas mean() function to the selected columns from the result of pandas groupby.
Pandas comes with a whole host of SQL-like aggregation functions you can apply when grouping on one or more columns.
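As a rough illustration of those SQL-like aggregations, the sketch below rebuilds the sample frame from the question and applies several aggregations at once with `agg`. The variable names `df` and `summary` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Several aggregations of `time` per cluster, roughly like
# SELECT cluster, COUNT(time), MIN(time), MAX(time), AVG(time) ... GROUP BY cluster
summary = df.groupby("cluster")["time"].agg(["count", "min", "max", "mean"])
print(summary)
```

Selecting the `time` column before aggregating keeps the non-numeric `org` column out of the calculation.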
To calculate the mean of whole columns in the DataFrame, use pandas.Series.mean() with a list of DataFrame columns. You can also get the mean for all numeric columns using DataFrame.
Pandas groupby is used to group data by category and apply a function to each group. It also helps to aggregate data efficiently. The pandas DataFrame.groupby() function splits the data into groups based on some criteria.
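To make the "split, then apply a function" idea concrete, here is a minimal sketch on the question's sample data. Iterating over the grouped object shows the split step, and calling `mean()` shows the apply step. The names `group_sizes` and `mean_time` are made up for this example.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Split: iterating over a GroupBy yields (key, sub-frame) pairs.
group_sizes = {}
for key, group in df.groupby("cluster"):
    group_sizes[key] = len(group)

# Apply and combine: one mean per group, collected into a Series.
mean_time = df.groupby("cluster")["time"].mean()

print(group_sizes)
print(mean_time)
```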
If you want to first take the mean over each combination of ['cluster', 'org'] and then take the mean over the cluster groups, you can use:
    In [59]: (df.groupby(['cluster', 'org'], as_index=False).mean()
                .groupby('cluster')['time'].mean())
    Out[59]:
    cluster
    1          15
    2          54
    3           6
    Name: time, dtype: int64

If you want the mean of cluster groups only, then you can use:
    In [58]: df.groupby(['cluster']).mean()
    Out[58]:
                  time
    cluster
    1        12.333333
    2        54.000000
    3         6.000000

You can also use groupby on ['cluster', 'org'] and then use mean():
    In [57]: df.groupby(['cluster', 'org']).mean()
    Out[57]:
                 time
    cluster org
    1       a       7
            c      23
    2       d      74
            h      34
    3       w       6
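For anyone who wants to reproduce this end to end, here is a self-contained sketch using the sample data from the question. It selects the `time` column explicitly, on the assumption that recent pandas versions will refuse to average the non-numeric `org` column. The names `per_org`, `per_cluster` and `plain` are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2, 3],
    "org": ["a", "a", "h", "c", "d", "w"],
    "time": [8, 6, 34, 23, 74, 6],
})

# Step 1: mean time for each (cluster, org) pair.
per_org = df.groupby(["cluster", "org"], as_index=False)["time"].mean()

# Step 2: mean of those per-org means within each cluster.
per_cluster = per_org.groupby("cluster")["time"].mean()

# For comparison: plain mean over all rows of each cluster.
plain = df.groupby("cluster")["time"].mean()

print(per_cluster)
print(plain)
```

The two results differ for cluster 1 because org `a` has two rows there: the two-step version counts `a` once, while the plain mean counts every row.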
I would simply do this, which follows your desired logic literally:

    df.groupby(['org']).mean().groupby(['cluster']).mean()
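One caveat worth hedging: grouping by `org` alone also averages the `cluster` column, so it only gives the intended result when each org belongs to a single cluster, as in the sample data. Below is a sketch of a variant that groups by both columns. The frame `df2` is a hypothetical variation of the question's data in which org `a` appears in two clusters.

```python
import pandas as pd

# Hypothetical variation of the sample data: org "a" appears in clusters 1 and 2.
df2 = pd.DataFrame({
    "cluster": [1, 1, 2, 1, 2],
    "org": ["a", "a", "a", "c", "d"],
    "time": [8, 6, 34, 23, 74],
})

result = (
    df2.groupby(["cluster", "org"])["time"].mean()  # mean per (cluster, org) pair
       .groupby(level="cluster").mean()             # mean of those means per cluster
)
print(result)
```

Grouping on the `cluster` index level in the second step keeps org `a` in cluster 1 separate from org `a` in cluster 2.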