
Python Pandas Group by date using datetime data

I have a column Date_Time that I want to group by date without creating a new column. Is this possible? My current code does not work:

df = pd.groupby(df,by=[df['Date_Time'].date()]) 
GoBlue_MathMan asked Sep 08 '16 20:09



2 Answers

You can group by the date part of the Date_Time column using dt.date:

df = df.groupby([df['Date_Time'].dt.date]).mean() 

Sample:

df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                   'B':[4,5,6]})

print (df)
   B           Date_Time
0  4 2001-10-01 10:00:00
1  5 2001-10-01 20:00:00
2  6 2001-10-02 06:00:00

print (df['Date_Time'].dt.date)
0    2001-10-01
1    2001-10-01
2    2001-10-02
Name: Date_Time, dtype: object

df = df.groupby([df['Date_Time'].dt.date])['B'].mean()
print(df)
Date_Time
2001-10-01    4.5
2001-10-02    6.0
Name: B, dtype: float64
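The .dt accessor used above is only available when Date_Time already has a datetime dtype. The sketch below assumes the column arrived as plain strings (a common situation after reading a CSV file) and converts it in place before grouping. The name raw and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: timestamps stored as plain strings.
raw = pd.DataFrame({
    "Date_Time": ["2001-10-01 10:00:00", "2001-10-01 20:00:00", "2001-10-02 06:00:00"],
    "B": [4, 5, 6],
})

# Convert the existing column to datetime64 so that .dt becomes available.
raw["Date_Time"] = pd.to_datetime(raw["Date_Time"])

# Group by the calendar date of each timestamp; no helper column is added to raw.
daily_mean = raw.groupby(raw["Date_Time"].dt.date)["B"].mean()
print(daily_mean)
```

The grouping key is computed on the fly and passed straight to groupby, so raw keeps exactly its original two columns.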

Another solution, starting again from the original DataFrame, uses resample:

df = df.set_index('Date_Time').resample('D')['B'].mean()

print(df)
Date_Time
2001-10-01    4.5
2001-10-02    6.0
Freq: D, Name: B, dtype: float64
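The dt.date and resample results hold the same values but differ in index type: dt.date yields Python date objects in a generic index, while resample yields a DatetimeIndex. If a datetime index is wanted without setting the index first, one option is to group on dt.normalize(), which keeps the datetime64 dtype and resets each time to midnight. This is a minimal sketch; df_demo, by_date and by_midnight are illustrative names.

```python
import pandas as pd

# Same three timestamps as in the sample, written out explicitly.
df_demo = pd.DataFrame({
    "Date_Time": pd.to_datetime(
        ["2001-10-01 10:00:00", "2001-10-01 20:00:00", "2001-10-02 06:00:00"]
    ),
    "B": [4, 5, 6],
})

# Key made of datetime.date objects: the result has a generic object index.
by_date = df_demo.groupby(df_demo["Date_Time"].dt.date)["B"].mean()

# Key made of midnight timestamps: the result has a DatetimeIndex.
by_midnight = df_demo.groupby(df_demo["Date_Time"].dt.normalize())["B"].mean()

print(type(by_date.index).__name__)
print(type(by_midnight.index).__name__)
```

A DatetimeIndex is usually more convenient if the grouped result will later be sliced by date strings or resampled again.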
jezrael answered Oct 06 '22 05:10


resample

df.resample('D', on='Date_Time').mean()

              B
Date_Time
2001-10-01  4.5
2001-10-02  6.0

Grouper

As suggested by @JosephCottam

df.set_index('Date_Time').groupby(pd.Grouper(freq='D')).mean()

              B
Date_Time
2001-10-01  4.5
2001-10-02  6.0
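pd.Grouper also accepts a key argument naming the datetime column, so the set_index step can be skipped. The sketch below uses made-up data with a missing day (df_gap is an illustrative name) to show one behaviour worth knowing: frequency-based grouping creates a bin for every day in the range, so days without rows appear as NaN unless they are dropped.

```python
import pandas as pd

# Hypothetical data with a gap: nothing was recorded on 2001-10-02.
df_gap = pd.DataFrame({
    "Date_Time": pd.to_datetime(
        ["2001-10-01 10:00:00", "2001-10-01 20:00:00", "2001-10-03 06:00:00"]
    ),
    "B": [4, 5, 6],
})

# key= points at the datetime column directly, so the index stays untouched.
by_day = df_gap.groupby(pd.Grouper(key="Date_Time", freq="D"))["B"].mean()
print(by_day)  # three daily bins; the empty day holds NaN

# Keep only the days that actually have rows.
print(by_day.dropna())
```

Grouping on dt.date, by contrast, only ever produces groups for dates that occur in the data, so no dropna step is needed there.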

Deprecated uses of TimeGrouper

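Older pandas versions let you set the index to 'Date_Time' and use pd.TimeGrouper. It has since been deprecated and removed from pandas in favor of pd.Grouper(freq='D'), so the following only runs on old releases:

df.set_index('Date_Time').groupby(pd.TimeGrouper('D')).mean().dropna()

              B
Date_Time
2001-10-01  4.5
2001-10-02  6.0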
piRSquared answered Oct 06 '22 07:10