Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove non-business days rows from pandas dataframe

Tags:

python

pandas

I have a dataframe with a timeseries data of wheat in df.

df = wt["WHEAT_USD"]

2016-05-02 02:00:00+02:00    4.780
2016-05-02 02:01:00+02:00    4.777
2016-05-02 02:02:00+02:00    4.780
2016-05-02 02:03:00+02:00    4.780
2016-05-02 02:04:00+02:00    4.780
Name: closeAsk, dtype: float64

When I plot the data it has these annoying horizontal lines because of weekends. Is there a simple way of removing the non-business days from the dataframe itself?

Something like

df = df.BDays()
like image 273
vandelay Avatar asked Jun 14 '16 04:06

vandelay


People also ask

How do you get rid of unwanted rows in Pandas?

You can delete a list of rows from Pandas by passing the list of indices to the drop() method. In this code, [5,6] is the index of the rows you want to delete. axis=0 denotes that rows should be deleted from the dataframe.

How do I delete non unique rows in Pandas?

You can set 'keep=False' in the drop_duplicates() function to remove all the duplicate rows. For E.x, df. drop_duplicates(keep=False) .

How do you subtract days in Pandas?

When the function receives the date string it will first use the Pandas to_datetime() function to convert it to a Python datetime and it will then use the timedelta() function to subtract the number of days defined in the days variable.


3 Answers

One simple solution is to slice out the days not in Monday to Friday:

In [11]: s[s.index.dayofweek < 5] Out[11]: 2016-05-02 00:00:00    4.780 2016-05-02 00:01:00    4.777 2016-05-02 00:02:00    4.780 2016-05-02 00:03:00    4.780 2016-05-02 00:04:00    4.780 Name: closeAsk, dtype: float64 

Note: this doesn't take into account bank holidays etc.

like image 75
Andy Hayden Avatar answered Oct 05 '22 23:10

Andy Hayden


Pandas BDay just ends up using .dayofweek<5 like the chosen answer, but can be extended to account for bank holidays, etc.

import pandas as pd
from pandas.tseries.offsets import BDay

isBusinessDay = BDay().onOffset
csv_path = 'C:\\Python27\\Lib\\site-packages\\bokeh\\sampledata\\daylight_warsaw_2013.csv'
dates_df = pd.read_csv(csv_path)
match_series = pd.to_datetime(dates_df['Date']).map(isBusinessDay)
dates_df[match_series]
like image 29
Dave Babbitt Avatar answered Oct 05 '22 22:10

Dave Babbitt


I am building a backtester for stock/FX trading and I also have these issue with days that are nan because that they are holidays or other non trading days.. you can download a financial calendar for the days that there is no trading and then you need to think about timezone and weekends.. etc..

But the best solution is not to use date/time as the index for the candles or price. So do not connect your price data to a date/time but just to a counter of candles or prices .. you can use a second index for this.. so for calculations of MA or other technical lines dont use date/time .. if you look at Metatrader 4/5 it also doesnt use date/time but the index of the data is the candle number !!

I think that you need to let go of the date-time for the price if you work with stock or FX data , of cause you can put them in a column of the data-frame but dont use it as the index This way you can avoid many problems

like image 24
Henri Frits maarseveen Avatar answered Oct 05 '22 23:10

Henri Frits maarseveen