Pandas resampling from months to weeks

Tags:

python

pandas

I am trying to upsample monthly data to weekly data. My time series dataframe of monthly values looks like this:

             qty
PERIOD_NAME 
2017-09-01  49842.0
2017-10-01  27275.0
2017-11-01  29159.0
2017-12-01  51344.0
2018-01-01  19103.0
2018-02-01  23570.0
2018-03-01  45139.0
2018-04-01  25722.0
2018-05-01  22644.0

I've attempted using a resample to weeks like this:

tgt_item_by_445_wk = tgt_item_by_445_wk.resample('W').sum()

which yields:

             qty
PERIOD_NAME 
2017-09-03  49842.0
2017-09-10  0.0
2017-09-17  0.0
2017-09-24  0.0
2017-10-01  27275.0
2017-10-08  0.0
2017-10-15  0.0
2017-10-22  0.0
2017-10-29  0.0

I've tried interpolation, but I can't get the result I'm looking for: each month's value split evenly across the weeks of that month, replacing the zeros, like this:

              qty
PERIOD_NAME 
2017-09-03  12460.5
2017-09-10  12460.5
2017-09-17  12460.5
2017-09-24  12460.5
2017-10-01  5455.0
2017-10-08  5455.0
2017-10-15  5455.0
2017-10-22  5455.0
2017-10-29  5455.0

Is there some method using resample, fills and interpolation that allows this?


asked Jun 16 '18 19:06

gman123

1 Answer

Let's try asfreq and groupby.

v = df.asfreq('W', method='ffill')
v /= v.groupby(v.index.strftime('%Y-%m')).transform('count')

                  qty
PERIOD_NAME          
2017-09-03   12460.50
2017-09-10   12460.50
2017-09-17   12460.50
2017-09-24   12460.50
2017-10-01    5455.00
2017-10-08    5455.00
2017-10-15    5455.00
2017-10-22    5455.00
2017-10-29    5455.00
2017-11-05    7289.75
2017-11-12    7289.75
2017-11-19    7289.75
2017-11-26    7289.75
2017-12-03   10268.80
2017-12-10   10268.80
2017-12-17   10268.80
2017-12-24   10268.80
2017-12-31   10268.80
2018-01-07    4775.75
2018-01-14    4775.75
2018-01-21    4775.75
2018-01-28    4775.75
2018-02-04    5892.50
2018-02-11    5892.50
2018-02-18    5892.50
2018-02-25    5892.50
2018-03-04   11284.75
2018-03-11   11284.75
2018-03-18   11284.75
2018-03-25   11284.75
2018-04-01    5144.40
2018-04-08    5144.40
2018-04-15    5144.40
2018-04-22    5144.40
2018-04-29    5144.40

This works because your values always fall on the first of each month. Alternatively, for the second step you may use

v /= v.groupby(v.qty).transform('count').values

which groups by the forward-filled value itself, so it assumes no two months share the same qty.
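For anyone who wants to run the asfreq and groupby approach end to end, here is a minimal self-contained sketch. The data is copied from the question; the names `weekly` and `weeks_in_month` are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Sample data copied from the question: one value on the first of each month.
df = pd.DataFrame(
    {"qty": [49842.0, 27275.0, 29159.0, 51344.0, 19103.0,
             23570.0, 45139.0, 25722.0, 22644.0]},
    index=pd.date_range("2017-09-01", periods=9, freq="MS", name="PERIOD_NAME"),
)

# Step 1: switch to a weekly (Sunday-anchored) index and forward-fill,
# so every week carries the total of the month it falls in.
weekly = df.asfreq("W", method="ffill")

# Step 2: count the weekly rows in each calendar month, then divide the
# forward-filled monthly total by that count.
month_key = weekly.index.strftime("%Y-%m")
weeks_in_month = weekly.groupby(month_key)["qty"].transform("count")
weekly["qty"] = weekly["qty"] / weeks_in_month

print(weekly.head(9))
```

Summing `weekly` by month gives back the original monthly totals. Note that the last month (2018-05-01) has no weekly rows, because the weekly index stops at the last Sunday on or before the final date in the data, which is why the output above ends at 2018-04-29.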


answered Oct 19 '22 21:10

cs95
