
python compare date list to start and end date columns in dataframe

Tags:

python

pandas

Problem: I have a dataframe with two columns: start date and end date. I also have a list of dates. Let's say the data looks something like this:

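import pandas as pd

data = [['1/1/2018', '3/1/2018'], ['2/1/2018', '3/1/2018'], ['4/1/2018', '6/1/2018']]
df = pd.DataFrame(data, columns=['startdate', 'enddate'])
# Parse the month/day/year strings into datetime64 so the columns compare as dates
df = df.apply(pd.to_datetime)

dates = ['1/1/2018', '2/1/2018']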

What I need to do is:

1) Create a new column for each date in the dates list.

2) For each row in the df, assign a 1 to the new column if that column's date falls between the row's start and end date; if not, assign a 0.

I tried using zip, but the df will have thousands of rows, whereas the dates list will contain only about 24 items (spanning 2 years), so the iteration stops once the dates list is exhausted, i.e., after 24 rows.
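The truncation described above is how zip behaves: it stops as soon as its shortest input runs out. A minimal sketch, assuming a three-row stand-in for the real df and the two-item dates list from the sample data (the names pairs and n_pairs are only for illustration):

```python
import pandas as pd

# Stand-in for the real data: 3 rows, but only 2 dates
df = pd.DataFrame({
    'startdate': pd.to_datetime(['1/1/2018', '2/1/2018', '4/1/2018']),
    'enddate': pd.to_datetime(['3/1/2018', '3/1/2018', '6/1/2018']),
})
dates = ['1/1/2018', '2/1/2018']

# zip pairs row i with date i and stops when the shorter input is exhausted,
# so the third row is never visited
pairs = list(zip(df.itertuples(index=False), dates))
n_pairs = len(pairs)
```

Here n_pairs equals the length of dates, not the number of rows, which is why pairing rows with dates one-to-one cannot cover every row/date combination.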

Below is what the original df looks like and how it should look afterwards:

Before:

   startdate    enddate
0 2018-01-01 2018-03-01
1 2018-02-01 2018-03-01
2 2018-04-01 2018-06-01

After:

  startdate   enddate 1/1/2018 2/1/2018
0  1/1/2018  3/1/2018        1        1
1  2/1/2018  3/1/2018        0        1
2  4/1/2018  6/1/2018        0        0

Any help on this would be much appreciated, thanks!

rkhan8 asked Dec 22 '25 08:12


1 Answer

Using NumPy broadcasting:

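# Start and end dates as 1-D datetime64 arrays of shape (n_rows,)
s1 = df.startdate.values
s2 = df.enddate.values
# Dates to test as a column vector of shape (n_dates, 1), so comparisons broadcast to (n_dates, n_rows)
v = pd.to_datetime(pd.Series(dates)).values[:, None]

# Transpose the boolean mask to (n_rows, n_dates) and cast it to 0/1
newdf = pd.DataFrame(((s1 <= v) & (s2 >= v)).T.astype(int), columns=dates, index=df.index)
pd.concat([df, newdf], axis=1)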
   startdate    enddate  1/1/2018  2/1/2018
0 2018-01-01 2018-03-01         1         1
1 2018-02-01 2018-03-01         0         1
2 2018-04-01 2018-06-01         0         0
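
For comparison, the same 0/1 columns can be built with an explicit loop over dates. This is only a sketch alongside the broadcast version, not a replacement for it: it makes one pass over the frame per date instead of a single vectorized comparison, which should be fine for roughly 24 dates. It rebuilds the sample frame from the question so it runs on its own, and it assumes the date strings are month/day/year; the name result is illustrative.

```python
import pandas as pd

# Sample data from the question, parsed as datetimes
df = pd.DataFrame({
    'startdate': pd.to_datetime(['1/1/2018', '2/1/2018', '4/1/2018']),
    'enddate': pd.to_datetime(['3/1/2018', '3/1/2018', '6/1/2018']),
})
dates = ['1/1/2018', '2/1/2018']

result = df.copy()
for d in dates:
    ts = pd.Timestamp(d)  # parsed month-first, so '2/1/2018' is 2018-02-01
    # 1 where startdate <= ts <= enddate (inclusive at both ends), else 0
    result[d] = ((df['startdate'] <= ts) & (df['enddate'] >= ts)).astype(int)
```

Each column is named after the original date string, matching the column labels in the output above.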
BENY answered Dec 24 '25 00:12


