Expand rows by date range having start and end in Pandas

I'm working with a data set that records a phenomenon occurring during certain time frames. For each event I am given its start and end time, its severity, and some other information. I would like to expand these frames over a larger time period: each event should become one row per hour between its start and end, and the hours not covered by any event should be left as NaNs.

Data set example:

                                date_end  severity  category
date_start
2018-01-04 07:00:00  2018-01-04 10:00:00        12         1
2018-01-04 12:00:00  2018-01-04 13:00:00        44         2

What I want is:

                     severity  category
date_start
2018-01-04 07:00:00        12         1
2018-01-04 08:00:00        12         1
2018-01-04 09:00:00        12         1
2018-01-04 10:00:00        12         1
2018-01-04 11:00:00       NaN       NaN
2018-01-04 12:00:00        44         2
2018-01-04 13:00:00        44         2
2018-01-04 14:00:00       NaN       NaN
2018-01-04 15:00:00       NaN       NaN

What would be an efficient way of achieving such a result?

asked Aug 16 '19 13:08

Aleks-1and

1 Answer

Assuming you are on pandas v0.25 or later, use explode:

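import pandas as pd

# For each row, list every hour from its start (the index label) to date_end, inclusive
df['hour'] = df.apply(lambda row: pd.date_range(row.name, row['date_end'], freq='h'), axis=1)
# explode gives one row per hour; that hour then replaces the original date_start index
df = (df.explode('hour').reset_index()
        .drop(columns=['date_start', 'date_end'])
        .rename(columns={'hour': 'date_start'})
        .set_index('date_start'))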
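As a minimal end-to-end sketch of the explode step, the snippet below rebuilds the two sample rows from the question and expands them. The variable name `expanded` and the explicit `pd.to_datetime` conversion are additions for illustration, not part of the original answer; the conversion is there on the assumption that `explode` may return an object-dtype column.

```python
import pandas as pd

# Sample data from the question; date_start is the index.
df = pd.DataFrame(
    {
        "date_end": pd.to_datetime(["2018-01-04 10:00:00", "2018-01-04 13:00:00"]),
        "severity": [12, 44],
        "category": [1, 2],
    },
    index=pd.DatetimeIndex(
        ["2018-01-04 07:00:00", "2018-01-04 12:00:00"], name="date_start"
    ),
)

# One list of hourly timestamps per event, start and end inclusive.
df["hour"] = [
    list(pd.date_range(start, end, freq="h"))
    for start, end in zip(df.index, df["date_end"])
]

# One row per hour; the old start/end columns are no longer needed.
expanded = (
    df.explode("hour")
    .reset_index(drop=True)
    .drop(columns="date_end")
    .rename(columns={"hour": "date_start"})
)
# Make sure the new index is a proper datetime index (assumption: explode
# may have produced an object-dtype column).
expanded["date_start"] = pd.to_datetime(expanded["date_start"])
expanded = expanded.set_index("date_start")

print(expanded)
```

The first event covers 07:00 to 10:00 and the second 12:00 to 13:00, so the result has six rows and no entry for 11:00.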

To get the NaN rows for hours that no event covers, reindex the DataFrame against the full set of timestamps you want in the report.

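# Report from Jan 4 - 5, 2018, from 7AM - 7PM
days = pd.date_range('2018-01-04', '2018-01-05')
hours = pd.to_timedelta(range(7, 20), unit='h')
# Every (day, hour) combination; day + hour offset gives the full timestamp
tmp = pd.MultiIndex.from_product([days, hours], names=['Date', 'Hour']).to_frame()

s = tmp['Date'] + tmp['Hour']
# Timestamps with no matching event row come back as NaN
df = df.reindex(s)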
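When the report window is a single contiguous stretch, as in the question's desired output, a plain `pd.date_range` may be enough as the reindex target. The sketch below assumes the frame has already been expanded to one row per hour (the values are typed in by hand from the question), and the 07:00 to 15:00 window and the names `expanded`, `full_range` and `result` are illustrative choices.

```python
import pandas as pd

# Expanded events as they look after the explode step (values from the question).
hours = pd.to_datetime(
    [
        "2018-01-04 07:00", "2018-01-04 08:00", "2018-01-04 09:00",
        "2018-01-04 10:00", "2018-01-04 12:00", "2018-01-04 13:00",
    ]
)
expanded = pd.DataFrame(
    {"severity": [12, 12, 12, 12, 44, 44], "category": [1, 1, 1, 1, 2, 2]},
    index=pd.DatetimeIndex(hours, name="date_start"),
)

# Assumed reporting window: 07:00 to 15:00 on a single day, hourly.
full_range = pd.date_range(
    "2018-01-04 07:00", "2018-01-04 15:00", freq="h", name="date_start"
)

# Hours with no event get NaN in every column; the integer columns
# become floats as a result.
result = expanded.reindex(full_range)
print(result)
```

Note that `reindex` needs unique index labels, so this only works as written if events do not overlap in time; overlapping events would have to be aggregated or de-duplicated first.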

answered Sep 27 '22 17:09

Code Different

Tags: python, date, datetime, pandas
