Given the following data frame:
import pandas as pd
df=pd.DataFrame({'A':['a','b','c'],
'first_date':['2015-08-31 00:00:00','2015-08-24 00:00:00','2015-08-25 00:00:00']})
df.first_date=pd.to_datetime(df.first_date) #(dtype='<M8[ns]')
df['last_date']=pd.to_datetime('5/6/2016') #(dtype='datetime64[ns]')
df
A first_date last_date
0 a 2015-08-31 2016-05-06
1 b 2015-08-24 2016-05-06
2 c 2015-08-25 2016-05-06
I'd like to create a new column which contains the list (or array) of dates between 'first_date' and 'last_date' which excludes weekends.
So far, I've tried this:
pd.date_range(df['first_date'],df['last_date'])
...but this error occurs:
TypeError: Cannot convert input to Timestamp
I also tried this before pd.date_range...
pd.Timestamp(df['first_date'])
...but no dice.
Thanks in advance!
P.S.:
After this hurdle, I'm going to try looking at other lists of dates and if they fall within the generated array (per row in 'A'), then subtract them out of the list or array). I'll post it as a separate question.
freq='B'
gives you business days, or no weekends.
Your error:
TypeError: Cannot convert input to Timestamp
Is the result of you passing a series to the pd.date_range
function when it is expecting a Timestamp
Instead, use apply
.
However, I still find it tricky to get lists into specific cells of dataframes. The way I use is to use a pd.Series([mylist])
. Notice it is a list of a list. If it were just pd.Series(mylist)
pandas would convert the list into a series and you'd get a series of series which is a dataframe.
try:
def fnl(x):
l = pd.date_range(x.loc['first_date'], x.loc['last_date'], freq='B')
return pd.Series([l])
df['range'] = df.apply(fnl, axis=1)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With