I am attempting to upsample monthly data to weekly data and have a time series dataframe of months that looks like this:
qty
PERIOD_NAME
2017-09-01 49842.0
2017-10-01 27275.0
2017-11-01 29159.0
2017-12-01 51344.0
2018-01-01 19103.0
2018-02-01 23570.0
2018-03-01 45139.0
2018-04-01 25722.0
2018-05-01 22644.0
I've attempted using a resample to weeks like this:
tgt_item_by_445_wk = tgt_item_by_445_wk.resample('W').sum()
which yields:
qty
PERIOD_NAME
2017-09-03 49842.0
2017-09-10 0.0
2017-09-17 0.0
2017-09-24 0.0
2017-10-01 27275.0
2017-10-08 0.0
2017-10-15 0.0
2017-10-22 0.0
2017-10-29 0.0
I've tried interpolation, but I can't get what I'm looking for: the unsampled weeks (the 0's) filled with an even split of each month's value, like this:
qty
PERIOD_NAME
2017-09-03 12460.5
2017-09-10 12460.5
2017-09-17 12460.5
2017-09-24 12460.5
2017-10-01 5455.0
2017-10-08 5455.0
2017-10-15 5455.0
2017-10-22 5455.0
2017-10-29 5455.0
Is there some method using resample, fills and interpolation that allows this?
Let's try asfreq and groupby.
# forward-fill each month's total into its week-ending-Sunday rows
v = df.asfreq('W', method='ffill')
# divide each week by the number of weeks falling in its calendar month
v /= v.groupby(v.index.strftime('%Y-%m')).transform('count')
qty
PERIOD_NAME
2017-09-03 12460.50
2017-09-10 12460.50
2017-09-17 12460.50
2017-09-24 12460.50
2017-10-01 5455.00
2017-10-08 5455.00
2017-10-15 5455.00
2017-10-22 5455.00
2017-10-29 5455.00
2017-11-05 7289.75
2017-11-12 7289.75
2017-11-19 7289.75
2017-11-26 7289.75
2017-12-03 10268.80
2017-12-10 10268.80
2017-12-17 10268.80
2017-12-24 10268.80
2017-12-31 10268.80
2018-01-07 4775.75
2018-01-14 4775.75
2018-01-21 4775.75
2018-01-28 4775.75
2018-02-04 5892.50
2018-02-11 5892.50
2018-02-18 5892.50
2018-02-25 5892.50
2018-03-04 11284.75
2018-03-11 11284.75
2018-03-18 11284.75
2018-03-25 11284.75
2018-04-01 5144.40
2018-04-08 5144.40
2018-04-15 5144.40
2018-04-22 5144.40
2018-04-29 5144.40
This works well since your values always fall on the first of each month. Alternatively, for the second step you may use

v /= v.groupby(v.qty).transform('count').values

which groups by the repeated qty values themselves rather than by month; note it relies on each month's total being distinct.
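Putting it together, here is a self-contained sketch of the asfreq/groupby approach, reconstructing the monthly data from the question (df and v are the names used above):

```python
import pandas as pd

# Reconstruct the monthly data from the question
idx = pd.to_datetime(
    ["2017-09-01", "2017-10-01", "2017-11-01", "2017-12-01",
     "2018-01-01", "2018-02-01", "2018-03-01", "2018-04-01", "2018-05-01"]
)
df = pd.DataFrame(
    {"qty": [49842.0, 27275.0, 29159.0, 51344.0,
             19103.0, 23570.0, 45139.0, 25722.0, 22644.0]},
    index=pd.Index(idx, name="PERIOD_NAME"),
)

# asfreq('W') reindexes to week-ending-Sunday dates between the first and
# last index values; method='ffill' carries each month's total forward
# into all of that month's weeks
v = df.asfreq("W", method="ffill")

# Group the weekly rows by calendar month ('%Y-%m') and divide each row
# by the number of weeks in its month, splitting the total evenly
v /= v.groupby(v.index.strftime("%Y-%m")).transform("count")

print(v.head(4))  # September's 49842.0 split over 4 weeks -> 12460.5 each
```

September 2017 contains four Sundays, so 49842.0 / 4 = 12460.5; October contains five, so 27275.0 / 5 = 5455.0, matching the output above.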