Pandas: Group by date and take the median of another variable

Tags: python, pandas

This is a small sample of my DataFrame. The full DataFrame has several additional variables and covers 6 months of data.

sentiment     date
1             2015-05-26 18:58:44
0.9           2015-05-26 19:57:31
0.7           2015-05-26 18:58:24
0.4           2015-05-27 19:17:34
0.6           2015-05-27 18:46:12
0.5           2015-05-27 13:32:24
1             2015-05-28 19:27:31
0.7           2015-05-28 18:58:44
0.2           2015-05-28 19:47:34

I want to group the DataFrame by just the day of the date column, but at the same time aggregate the median of the sentiment column.

Everything I have tried with groupby, the dt accessor, and TimeGrouper has failed.

I want to return a pandas DataFrame, not a GroupBy object.

The date column has dtype M8[ns] (datetime64[ns]).

The sentiment column has dtype float64.

asked Jan 08 '16 15:01 by RDJ


1 Answer

You fortunately have the tools you need listed in your question.

In [61]: df.groupby(df.date.dt.date)[['sentiment']].median()
Out[61]:
            sentiment
2015-05-26        0.9
2015-05-27        0.5
2015-05-28        0.7
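
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds the sample data from the question, applies the same groupby, and then shows two variations that are my own additions rather than part of the original answer: `reset_index()` to turn the day into an ordinary column, and `pd.Grouper(key="date", freq="D")` as an alternative way to bucket by day. The variable names (`daily`, `daily_flat`, `daily_grouper`) are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "sentiment": [1, 0.9, 0.7, 0.4, 0.6, 0.5, 1, 0.7, 0.2],
    "date": pd.to_datetime([
        "2015-05-26 18:58:44", "2015-05-26 19:57:31", "2015-05-26 18:58:24",
        "2015-05-27 19:17:34", "2015-05-27 18:46:12", "2015-05-27 13:32:24",
        "2015-05-28 19:27:31", "2015-05-28 18:58:44", "2015-05-28 19:47:34",
    ]),
})

# Group by calendar day. Selecting [["sentiment"]] (a list) keeps the
# result a DataFrame rather than a Series.
daily = df.groupby(df["date"].dt.date)[["sentiment"]].median()

# Variation (assumption, not in the original answer): move the day
# out of the index and into a regular column.
daily_flat = daily.reset_index()

# Variation (assumption): pd.Grouper buckets by day and keeps a
# datetime64 index instead of plain datetime.date objects.
daily_grouper = df.groupby(pd.Grouper(key="date", freq="D"))[["sentiment"]].median()

print(daily)
```

One design note: `dt.date` yields Python `date` objects, so the resulting index is object dtype. If you need a proper DatetimeIndex for later resampling or plotting, the `pd.Grouper` variant may be more convenient, though it can also emit rows for days with no data.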

answered Nov 03 '22 02:11 by chrisaycock