Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to turn a date-indexed dataframe containing durations of events, into a dataframe of binary data showing event for each day?

Tags:

python

pandas

I apologise for the poorly worded question but it's quite heard to put into one line.

I have a date-indexed dataframe that contains data relating to the duration of events, like so:

Date           Duration
12-01-2010     5
04-02-2010     1
14-02-2010     241
23-12-2010     6

I would like to turn this into a dataframe, indexed daily, containing binarised data which shows whether or not an event is occurring on a given day. For example, for the first event identified above which lasted 5 days:

Date           Event
12-01-2010     1
13-01-2010     1
14-01-2010     1
15-01-2010     1
16-01-2010     1
17-01-2010     0
18-01-2010     0

Any ideas?

Thanks

like image 748
SHV_la Avatar asked Aug 27 '19 15:08

SHV_la


1 Answers

Assuming you are on pandas 0.25 so you can use explode:

# Generate the list of days that has an event
s = df.apply(lambda row: pd.date_range(row['Date'], periods=row['Duration']), axis=1) \
        .explode() \
        .drop_duplicates()

# First line: we know those days have at least one event so mark them with 1
# Second line: expand it to cover every day of the year and fill the missing days with 0
result = pd.DataFrame({'Event': 1}, index=s) \
            .reindex(pd.date_range('2010-01-01', '2010-12-31'), fill_value=0)

Result:

            Event
2010-01-01      0
2010-01-02      0
2010-01-03      0
2010-01-04      0
2010-01-05      0
2010-01-06      0
2010-01-07      0
2010-01-08      0
2010-01-09      0
2010-01-10      0
2010-01-11      0
2010-01-12      1
2010-01-13      1
2010-01-14      1
2010-01-15      1
2010-01-16      1
2010-01-17      0
2010-01-18      0
2010-01-19      0
2010-01-20      0
like image 105
Code Different Avatar answered Nov 15 '22 00:11

Code Different