
Pandas: Selecting DataFrame rows between two dates (Datetime Index)

I have a Pandas DataFrame with a DatetimeIndex and one column, MSE Loss. The index is formatted as follows:

DatetimeIndex(['2015-07-16 07:14:41', '2015-07-16 07:14:48',
           '2015-07-16 07:14:54', '2015-07-16 07:15:01',
           '2015-07-16 07:15:07', '2015-07-16 07:15:14',...]

The index spans several days.

I want to select all the rows (all times) of a particular day without knowing the actual timestamps in advance. For example: everything between 2015-07-16 07:00:00 and 2015-07-16 23:00:00.

I tried the slicing approach outlined in a linked answer, but

df[date_from:date_to]

outputs:

KeyError: Timestamp('2015-07-16 07:00:00')

So it wants exact index values. Furthermore, I don't have a date column, only an index with the dates.

What is the best way to select a whole day by providing just a date, such as 2015-07-16, and how can I then select a specific time range within that day?

user3142067 asked Jun 14 '17 14:06


1 Answer

Option 1:

Sample df:

df
                      a
2015-07-16 07:14:41  12
2015-07-16 07:14:48  34
2015-07-16 07:14:54  65
2015-07-16 07:15:01  34
2015-07-16 07:15:07  23
2015-07-16 07:15:14   1

It looks like you're slicing without .loc. Use .loc and pass the bounds as strings; the bounds don't have to match existing index entries as long as the index is sorted:

df.loc['2015-07-16 07:00:00':'2015-07-16 23:00:00']
                      a
2015-07-16 07:14:41  12
2015-07-16 07:14:48  34
2015-07-16 07:14:54  65
2015-07-16 07:15:01  34
2015-07-16 07:15:07  23
2015-07-16 07:15:14   1
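
The question also asks how to select a whole day from just a date. A minimal sketch of that, assuming a sorted DatetimeIndex: passing a date-only string to .loc selects every row on that day (partial string indexing). The second day's timestamps below are made up for illustration so that the index spans more than one day; they are not from the original sample.

```python
import pandas as pd

# Illustrative data: the 2015-07-17 rows are hypothetical, added only so the
# index covers more than one day.
idx = pd.to_datetime([
    "2015-07-16 07:14:41", "2015-07-16 07:14:48", "2015-07-16 07:14:54",
    "2015-07-17 07:15:01", "2015-07-17 07:15:07", "2015-07-17 07:15:14",
])
df = pd.DataFrame({"a": [12, 34, 65, 34, 23, 1]}, index=idx)

# Label slicing with bounds that are not in the index relies on a sorted index.
df = df.sort_index()

# Whole day: a date-only string matches every timestamp on that date.
whole_day = df.loc["2015-07-16"]

# Time window within the day: both bounds are inclusive.
window = df.loc["2015-07-16 07:00:00":"2015-07-16 23:00:00"]

print(whole_day)
print(window)
```

If the index is not sorted, a slice whose bounds are missing from the index can raise a KeyError like the one in the question, so calling sort_index() first is a reasonable precaution.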

Option 2:

You can use boolean indexing on the index:

df[(df.index >= '2015-07-16 07:00:00') & (df.index <= '2015-07-16 23:00:00')]
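
As a rough sketch of Option 2 on an index that is not sorted: the mask compares df.index against the two bounds, so it does not depend on row order. The sample rows and the names start, end, and mask are made up for illustration. DataFrame.between_time is shown as a related option for when the same time-of-day window is wanted on every day rather than on one specific date.

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample data.
idx = pd.to_datetime([
    "2015-07-17 07:15:01", "2015-07-16 07:14:41",
    "2015-07-16 23:30:00", "2015-07-16 07:14:48",
])
df = pd.DataFrame({"a": [34, 12, 99, 34]}, index=idx)

start = pd.Timestamp("2015-07-16 07:00:00")
end = pd.Timestamp("2015-07-16 23:00:00")

# Element-wise comparison on the index; works regardless of row order.
mask = (df.index >= start) & (df.index <= end)
in_window = df[mask]

# Same time-of-day window applied to every date in the index.
daytime = df.between_time("07:00", "23:00")

print(in_window)
print(daytime)
```

The mask keeps rows in their original order, so sort the result afterwards if chronological order matters.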
Andrew L answered Oct 11 '22 03:10