Pandas DataFrame - Dropping Certain Hours of the Day from 20 Years of Historical Data

Question

I have stock market data for a single security going back 20 years. The data is currently in a pandas DataFrame, in the following format:

[image of the DataFrame]

The problem is, I do not want any "after hours" trading data in my DataFrame. The market in question is open from 9:30AM to 4PM (09:30 to 16:00 on each trading day). I would like to drop all rows of data that are not within this time frame.

My instinct is to use a Pandas mask, which I know how to do if I wanted certain hours in a single day:

mask = (df['date'] > '2015-07-06 09:30:00') & (df['date'] <= '2015-07-06 16:00:00')
sub = df.loc[mask]

However, I have no idea how to apply such a mask on a recurring basis to remove the data for certain times of day over a 20-year period.

jorijnsmit · Accepted Answer

I think the answer is already in the comments (@Parfait's .between_time), but it got lost among the debugging issues. It appears your df['date'] column is not of datetime type yet.

This should be enough to fix that and get the required result:

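import pandas as pd

df['date'] = pd.to_datetime(df['date'])  # parse the strings into datetime values
df = df.set_index('date')  # between_time needs a DatetimeIndex
df = df.between_time('9:30', '16:00')  # keep rows from 09:30 to 16:00 on every day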
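
A minimal sketch of the same steps on made-up data, to show that the filter applies to every day at once rather than to a single date. The timestamps and `close` values below are invented for illustration; only `pd.to_datetime`, `set_index`, and `between_time` come from the answer above. Both endpoints are included by default.

```python
import pandas as pd

# Hypothetical sample: bars from two days, including after-hours rows.
df = pd.DataFrame({
    'date': ['2015-07-06 04:00:00', '2015-07-06 09:30:00',
             '2015-07-06 16:00:00', '2015-07-06 18:00:00',
             '2015-07-07 08:00:00', '2015-07-07 12:00:00'],
    'close': [125.00, 125.32, 125.21, 125.05, 125.12, 125.40],
})

df['date'] = pd.to_datetime(df['date'])

# between_time looks only at the time of day, so it filters every date.
regular = df.set_index('date').between_time('09:30', '16:00')

print(regular)
```

Here the 09:30 and 16:00 rows from the first day and the 12:00 row from the second day are kept; the other three rows are dropped.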

Bhavesh Ghodasara · Answer

The problem here is how you are importing the data. There is no indicator of whether 04:00 is AM or PM. Based on your comments we have to assume it is PM, but the input shows it as AM.
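
One rough way to check whether the timestamps were imported on a 12-hour clock is to look at which hours of the day actually occur. The sketch below uses invented data, and the names `rows_per_hour` and `has_afternoon_hours` are only illustrative. If hours 13 through 16 never appear in 20 years of data, the AM/PM information was probably lost on import.

```python
import pandas as pd

# Hypothetical sample in which no timestamp has an hour above 12.
df = pd.DataFrame({'date': ['2015-07-06 04:00:00', '2015-07-06 09:45:00',
                            '2015-07-06 12:10:00', '2015-07-06 03:59:00']})
df['date'] = pd.to_datetime(df['date'])

# Count rows per hour of day, ordered by hour.
rows_per_hour = df['date'].dt.hour.value_counts().sort_index()

# True only if at least one row falls at 13:00 or later.
has_afternoon_hours = bool((df['date'].dt.hour >= 13).any())

print(rows_per_hour)
print(has_afternoon_hours)
```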

To solve this we need to combine two conditions with an OR clause:

9:30-11:59
0:00-4:00

Input:

import pandas as pd

df = pd.DataFrame({'date':   {880551: '2015-07-06 04:00:00', 880552: '2015-07-06 04:02:00', 880553: '2015-07-06 04:03:00', 880554: '2015-07-06 04:04:00', 880555: '2015-07-06 04:05:00'},
                   'open':   {880551: 125.00, 880552: 125.36, 880553: 125.34, 880554: 125.08, 880555: 125.12},
                   'high':   {880551: 125.00, 880552: 125.36, 880553: 125.34, 880554: 125.11, 880555: 125.12},
                   'low':    {880551: 125.00, 880552: 125.32, 880553: 125.21, 880554: 125.05, 880555: 125.12},
                   'close':  {880551: 125.00, 880552: 125.32, 880553: 125.21, 880554: 125.05, 880555: 125.12},
                   'volume': {880551: 141, 880552: 200, 880553: 750, 880554: 17451, 880555: 1000},
                   })


df.head()

                       date    open    high     low   close  volume
880551  2015-07-06 04:00:00  125.00  125.00  125.00  125.00     141
880552  2015-07-06 04:02:00  125.36  125.36  125.32  125.32     200
880553  2015-07-06 04:03:00  125.34  125.34  125.21  125.21     750
880554  2015-07-06 04:04:00  125.08  125.11  125.05  125.05   17451
880555  2015-07-06 04:05:00  125.12  125.12  125.12  125.12    1000

from datetime import time

# Two windows, because the data has no AM/PM marker: 9:30-11:59 and 0:00-4:00
start_first = time(9, 30)
end_first = time(11, 59)
start_second = time(0, 0)
end_second = time(4, 0)
df['date'] = pd.to_datetime(df['date'])
# Keep a row if its time of day falls in either window
df = df[(df['date'].dt.time.between(start_first, end_first)) | (df['date'].dt.time.between(start_second, end_second))]
df

                       date   open   high    low  close  volume
880551  2015-07-06 04:00:00  125.0  125.0  125.0  125.0     141

The above is not good practice, and I strongly discourage using this kind of ambiguous data. The long-term solution is to populate the data correctly with AM/PM.
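
As a sketch of what unambiguous input could look like, assume the raw strings carry an explicit AM/PM marker (the sample strings and the exact layout are assumptions, not taken from the question). `pd.to_datetime` can then parse them with a 12-hour format string:

```python
import pandas as pd

# Hypothetical raw strings that include an explicit AM/PM marker.
raw = pd.Series(['2015-07-06 09:30:00 AM', '2015-07-06 04:00:00 PM',
                 '2015-07-06 04:00:00 AM'])

# %I is the hour on a 12-hour clock and %p is the AM/PM marker.
parsed = pd.to_datetime(raw, format='%Y-%m-%d %I:%M:%S %p')

print(parsed.dt.hour.tolist())
```

After parsing, 04:00 PM becomes hour 16 and 04:00 AM stays hour 4, so a single 09:30 to 16:00 window is enough.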

If the data is in the correct format, we can achieve it in two ways:

1) using datetime

from datetime import time

start = time(9, 30)
end = time(16)  # 16:00
df['date'] = pd.to_datetime(df['date'])
# Compare only the time-of-day part of each timestamp (both ends inclusive)
df = df[df['date'].dt.time.between(start, end)]

2) using between_time, which only works with a DatetimeIndex

df['date'] = pd.to_datetime(df['date'])

df = (df.set_index('date')
          .between_time('09:30', '16:00')
          .reset_index())
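
Note that the `set_index` / `reset_index` round trip replaces the original row labels (880551 and so on) with a fresh 0-based index. If those labels matter, one alternative, offered here as a sketch rather than as part of the original answer, is `DatetimeIndex.indexer_between_time`, which returns the integer positions of the matching rows. The sample data is invented.

```python
import pandas as pd

# Hypothetical sample that keeps the original integer row labels.
df = pd.DataFrame(
    {'date': ['2015-07-06 04:00:00', '2015-07-06 09:30:00',
              '2015-07-06 16:00:00', '2015-07-06 16:01:00'],
     'close': [125.00, 125.32, 125.21, 125.05]},
    index=[880551, 880552, 880553, 880554],
)
df['date'] = pd.to_datetime(df['date'])

# Build a temporary DatetimeIndex from the column and ask for the
# positions of rows whose time of day lies in the window.
positions = pd.DatetimeIndex(df['date']).indexer_between_time('09:30', '16:00')

# Select by position, so the frame's own index is left untouched.
regular = df.iloc[positions]

print(regular)
```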

If you still get an error, edit your question to show your code line by line along with the exact error message.

Tags: python, pandas, dataframe, numpy

HMLDude
