Adding rows for each month in a dataframe based on column date

I am working with financial data that I need to break down by month. Here is my dataframe:

invoice_id,date_from,date_to
30492,2019-02-04,2019-09-18

I want to split each row into the months between date_from and date_to, so I need one row per month, running from the start of that month (or date_from) to the end of that month (or date_to). The final output should look like:

invoice_id,date_from,date_to
30492,2019-02-04,2019-02-28
30492,2019-03-01,2019-03-31
30492,2019-04-01,2019-04-30
30492,2019-05-01,2019-05-31
30492,2019-06-01,2019-06-30
30492,2019-07-01,2019-07-31
30492,2019-08-01,2019-08-31
30492,2019-09-01,2019-09-18

This needs to handle leap years as well. Is there a native method in pandas' datetime functionality that I can use to achieve the desired output?

asked Apr 25 '19 07:04

Prasanna

1 Answer

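Use the approach below. It assumes date_from and date_to are already datetime columns (for example, converted with pd.to_datetime) and that invoice_id is unique:

import numpy as np
import pandas as pd

print (df)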
   invoice_id  date_from    date_to
0       30492 2019-02-04 2019-09-18
1       30493 2019-01-20 2019-03-10

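# add the first day of every month between date_from and date_to
df1 = pd.concat([pd.Series(r.invoice_id, pd.date_range(r.date_from, r.date_to, freq='MS'))
                 for r in df.itertuples()]).reset_index()
df1.columns = ['date_from', 'invoice_id']

# combine the original start dates with the month starts, sort them into position,
# and drop the duplicate row that appears when date_from is itself the 1st of a month
df2 = (pd.concat([df[['invoice_id', 'date_from']], df1], sort=False, ignore_index=True)
         .sort_values(['invoice_id', 'date_from'])
         .drop_duplicates()
         .reset_index(drop=True))

# every row except the last one per invoice ends at its month end;
# the last row per invoice ends at the original date_to
mask = df2['invoice_id'].duplicated(keep='last')
s = df2['invoice_id'].map(df.set_index('invoice_id')['date_to'])
# MonthEnd(0) keeps a date that is already a month end instead of rolling to the next month
df2['date_to'] = np.where(mask, df2['date_from'] + pd.offsets.MonthEnd(0), s)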

print (df2)
    invoice_id  date_from    date_to
0        30492 2019-02-04 2019-02-28
1        30492 2019-03-01 2019-03-31
2        30492 2019-04-01 2019-04-30
3        30492 2019-05-01 2019-05-31
4        30492 2019-06-01 2019-06-30
5        30492 2019-07-01 2019-07-31
6        30492 2019-08-01 2019-08-31
7        30492 2019-09-01 2019-09-18
8        30493 2019-01-20 2019-01-31
9        30493 2019-02-01 2019-02-28
10       30493 2019-03-01 2019-03-10
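
As a possible alternative, here is a minimal sketch that builds the rows from monthly periods instead of masks. It is not part of the original answer: split_by_month and the example frame are hypothetical names introduced for illustration, and it assumes date_from and date_to are datetime columns with date_from <= date_to. Leap years need no special handling, because each period's end comes from the calendar.

```python
import pandas as pd


def split_by_month(df):
    """Return one row per calendar month touched by [date_from, date_to].

    Hypothetical helper: assumes datetime columns and date_from <= date_to.
    """
    rows = []
    for r in df.itertuples(index=False):
        # One monthly period for every calendar month the interval touches.
        for p in pd.period_range(r.date_from, r.date_to, freq="M"):
            rows.append({
                "invoice_id": r.invoice_id,
                # Clip the month boundaries to the original interval.
                "date_from": max(r.date_from, p.start_time),
                "date_to": min(r.date_to, p.end_time.normalize()),
            })
    return pd.DataFrame(rows, columns=["invoice_id", "date_from", "date_to"])


# Example data (made up for this sketch; the second row crosses a leap-year February).
example = pd.DataFrame({
    "invoice_id": [30492, 30494],
    "date_from": pd.to_datetime(["2019-02-04", "2020-02-10"]),
    "date_to": pd.to_datetime(["2019-09-18", "2020-03-05"]),
})

result = split_by_month(example)
print(result)
```

For the second example row, the February piece ends on 2020-02-29. An interval that starts on a month end or on the 1st of a month also yields exactly one row for that month, since each month is visited only once.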

answered Sep 26 '22 08:09

jezrael

Tags: python, datetime, pandas, calendar
