
Modify output from Python Pandas describe

Is there a way to omit some of the output from pandas describe? This command gives me exactly what I want as a table (count and mean of executeTime by simpleDate):

df.groupby('simpleDate').executeTime.describe().unstack(1) 

However, count and mean are all I want. I want to drop std, min, max, etc. So far I've only read how to modify column size.

I'm guessing the answer is going to be to re-write the line, not using describe, but I haven't had any luck grouping by simpleDate and getting the count with a mean on executeTime.

I can do count by date:

df.groupby(['simpleDate']).size() 

or executeTime by date:

df.groupby(['simpleDate']).mean()['executeTime'].reset_index() 

But can't figure out the syntax to combine them.

My desired output:

            count  mean
09-10-2013      8  20.523
09-11-2013      4  21.112
09-12-2013      3  18.531
...            ..  ...
KHibma asked Oct 01 '13 19:10


2 Answers

The .describe() method returns a DataFrame in which count, std, max, etc. are labels of the index, so, following the documentation, you can select the rows you want by label with .loc, for example:

df.describe().loc[['count','max']] 
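
A minimal sketch of the row selection above, assuming a small made-up DataFrame that borrows the column names from the question (the values are hypothetical). By default describe() summarizes only the numeric columns, so the string dates are left out of the result.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "simpleDate": ["09-10-2013", "09-10-2013", "09-11-2013"],
    "executeTime": [20.0, 22.0, 18.0],
})

# describe() on a DataFrame puts the statistic names in the index,
# so .loc with a list of labels keeps only those rows.
summary = df.describe().loc[["count", "mean"]]
print(summary)
```

Swapping 'max' for 'mean' in the label list is the only change needed to match what the question asks for.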
Rafa answered Oct 18 '22 21:10


Calling describe() on a Series returns a Series, so you can just select the entries you want:

In [6]: s = Series(np.random.rand(10))

In [7]: s
Out[7]:
0    0.302041
1    0.353838
2    0.421416
3    0.174497
4    0.600932
5    0.871461
6    0.116874
7    0.233738
8    0.859147
9    0.145515
dtype: float64

In [8]: s.describe()
Out[8]:
count    10.000000
mean      0.407946
std       0.280562
min       0.116874
25%       0.189307
50%       0.327940
75%       0.556053
max       0.871461
dtype: float64

In [9]: s.describe()[['count','mean']]
Out[9]:
count    10.000000
mean      0.407946
dtype: float64
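
For the grouped case in the question, one possible alternative (not taken from either answer) is to skip describe() and pass the statistic names to agg on the grouped column. This is a sketch under assumptions: the data below is made up, and only the column names simpleDate and executeTime come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "simpleDate": ["09-10-2013", "09-10-2013", "09-11-2013"],
    "executeTime": [20.0, 21.0, 18.0],
})

# Group by date, then compute only the two statistics of interest.
# The result has one row per date and one column per statistic.
result = df.groupby("simpleDate")["executeTime"].agg(["count", "mean"])
print(result)
```

Note that count ignores missing values, whereas the size() call shown in the question counts every row, so the two can differ when executeTime contains NaN.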
Jeff answered Oct 18 '22 19:10