I apologise for the poorly worded question but it's quite heard to put into one line.
I have a date-indexed dataframe that contains data relating to the duration of events, like so:
Date Duration
12-01-2010 5
04-02-2010 1
14-02-2010 241
23-12-2010 6
I would like to turn this into a dataframe, indexed daily, containing binarised data which shows whether or not an event is occurring on a given day. For example, for the first event identified above which lasted 5 days:
Date Event
12-01-2010 1
13-01-2010 1
14-01-2010 1
15-01-2010 1
16-01-2010 1
17-01-2010 0
18-01-2010 0
Any ideas?
Thanks
Assuming you are on pandas 0.25 so you can use explode
:
# Generate the list of days that has an event
s = df.apply(lambda row: pd.date_range(row['Date'], periods=row['Duration']), axis=1) \
.explode() \
.drop_duplicates()
# First line: we know those days have at least one event so mark them with 1
# Second line: expand it to cover every day of the year and fill the missing days with 0
result = pd.DataFrame({'Event': 1}, index=s) \
.reindex(pd.date_range('2010-01-01', '2010-12-31'), fill_value=0)
Result:
Event
2010-01-01 0
2010-01-02 0
2010-01-03 0
2010-01-04 0
2010-01-05 0
2010-01-06 0
2010-01-07 0
2010-01-08 0
2010-01-09 0
2010-01-10 0
2010-01-11 0
2010-01-12 1
2010-01-13 1
2010-01-14 1
2010-01-15 1
2010-01-16 1
2010-01-17 0
2010-01-18 0
2010-01-19 0
2010-01-20 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With