
Pandas Create Range of Dates Without Weekends

Given the following data frame:

import pandas as pd
df=pd.DataFrame({'A':['a','b','c'],
        'first_date':['2015-08-31 00:00:00','2015-08-24 00:00:00','2015-08-25 00:00:00']})
df.first_date=pd.to_datetime(df.first_date) #(dtype='<M8[ns]')
df['last_date']=pd.to_datetime('5/6/2016') #(dtype='datetime64[ns]')
df

    A   first_date   last_date
0   a   2015-08-31  2016-05-06
1   b   2015-08-24  2016-05-06
2   c   2015-08-25  2016-05-06

I'd like to create a new column that contains, for each row, the list (or array) of dates between 'first_date' and 'last_date', excluding weekends.

So far, I've tried this:

pd.date_range(df['first_date'],df['last_date'])

...but this error occurs:

TypeError: Cannot convert input to Timestamp

I also tried this before pd.date_range...

pd.Timestamp(df['first_date'])

...but no dice.

Thanks in advance!

P.S.:

After this hurdle, I'm going to look at other lists of dates and, if they fall within the generated array (per row in 'A'), subtract them from the list or array. I'll post that as a separate question.

Dance Party asked Jan 06 '23 22:01


1 Answer

Passing freq='B' to pd.date_range gives you business days only (Monday to Friday), so weekends are left out.
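As a quick illustration of what freq='B' does, here is a minimal sketch over a single week. The dates are picked only for the example (2015-08-31 is a Monday); they are not a suggestion for the real data.

```python
import pandas as pd

# One calendar week, Monday 2015-08-31 through Sunday 2015-09-06.
all_days = pd.date_range("2015-08-31", "2015-09-06")                  # default freq="D"
business_days = pd.date_range("2015-08-31", "2015-09-06", freq="B")   # Mon-Fri only

print(len(all_days))                   # 7
print(len(business_days))              # 5
print(business_days.dayofweek.max())   # 4, i.e. Friday (Monday is 0)
```

Note that 'B' only drops Saturdays and Sundays; it knows nothing about public holidays.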

Your error:

TypeError: Cannot convert input to Timestamp

...is the result of passing a Series to pd.date_range, which expects a single Timestamp (or something convertible to one) for each of its start and end arguments.

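Instead, use apply with axis=1, so that your function receives one row at a time and can hand scalar dates to pd.date_range.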

However, I still find it tricky to get lists into individual cells of a dataframe. My approach is to return pd.Series([mylist]). Notice that it is a list containing a list. If it were just pd.Series(mylist), pandas would spread the elements of the list across the series, and the row-wise apply would give you a wide dataframe instead of one value per row.

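Try:

def fnl(x):
    # x is one row; build the business-day range between its two dates
    l = pd.date_range(x.loc['first_date'], x.loc['last_date'], freq='B')
    # wrap the range in a list so the whole thing lands in a single cell
    return pd.Series([l])

# axis=1 calls fnl once per row
df['range'] = df.apply(fnl, axis=1)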
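If wrapping the result in a Series feels awkward, another way to get one list per cell is a plain list comprehension over the two date columns. This is only a sketch of an alternative, not the approach above: it rebuilds the question's dataframe so that it runs on its own, and it stores ordinary Python lists of Timestamps (via tolist()) rather than DatetimeIndex objects, which is an assumption about what is most convenient downstream.

```python
import pandas as pd

# Rebuild the dataframe from the question so the example is self-contained.
df = pd.DataFrame({
    "A": ["a", "b", "c"],
    "first_date": pd.to_datetime(["2015-08-31", "2015-08-24", "2015-08-25"]),
})
df["last_date"] = pd.to_datetime("5/6/2016")

# zip yields one (start, end) pair of scalar Timestamps per row, which is
# what pd.date_range expects; tolist() turns each range into a plain list.
df["range"] = [
    pd.date_range(start, end, freq="B").tolist()
    for start, end in zip(df["first_date"], df["last_date"])
]

# Every stored date should be a weekday (dayofweek 0-4).
print(all(ts.dayofweek < 5 for dates in df["range"] for ts in dates))
```

Because each cell holds a plain list, removing other dates later can be done with an ordinary list comprehension or a set difference on that cell.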
piRSquared answered Feb 02 '23 09:02