Modified from this example:
import io
import pandas as pd
import matplotlib.pyplot as plt
data = io.StringIO('''\
Values
1992-08-27 07:46:48,1
1992-08-27 08:00:48,2
1992-08-27 08:33:48,4
1992-08-27 08:43:48,3
1992-08-27 08:48:48,1
1992-08-27 08:51:48,5
1992-08-27 08:53:48,4
1992-08-27 08:56:48,2
1992-08-27 09:03:48,1
''')
# read_csv(..., squeeze=True) was removed in pandas 2.0; squeeze the single column instead
s = pd.read_csv(data).squeeze('columns')
s.index = pd.to_datetime(s.index)
res = s.resample('4s').interpolate('linear')
print(res)
plt.plot(res, '.-')
plt.plot(s, 'o')
plt.grid(True)
It works as expected:
1992-08-27 07:46:48 1.000000
1992-08-27 07:46:52 1.004762
1992-08-27 07:46:56 1.009524
1992-08-27 07:47:00 1.014286
1992-08-27 07:47:04 1.019048
1992-08-27 07:47:08 1.023810
1992-08-27 07:47:12 1.028571
....
but if I change the resample to '5s', it produces only NaNs:
1992-08-27 07:46:45 NaN
1992-08-27 07:46:50 NaN
1992-08-27 07:46:55 NaN
1992-08-27 07:47:00 NaN
1992-08-27 07:47:05 NaN
1992-08-27 07:47:10 NaN
1992-08-27 07:47:15 NaN
....
Why?
The problem is that it ignores the NaNs.
Option 1
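That's because '4s' aligns perfectly with your existing index: every original timestamp falls exactly on the 4-second grid, so when you resample you keep the values from your old series and can interpolate between them. With '5s', none of the original timestamps land on the new grid, so the resampled series contains only NaNs and there is nothing to interpolate from. What you want to do is create an index that is the union of the old index with a new index, then interpolate, and finally reindex with the new index.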
oidx = s.index
nidx = pd.date_range(oidx.min(), oidx.max(), freq='5s')
res = s.reindex(oidx.union(nidx)).interpolate('index').reindex(nidx)
res.plot(style='.-')
s.plot(style='o')
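As a rough check of the alignment argument, the sketch below counts how many of the original timestamps coincide with the grid that resample builds for each frequency, then confirms that the union-index approach leaves no gaps. The series is rebuilt inline from the question's sample data, and the helper name count_on_grid is made up for this illustration.

```python
import pandas as pd

# Same timestamps and values as the sample data in the question.
s = pd.Series(
    [1, 2, 4, 3, 1, 5, 4, 2, 1],
    index=pd.to_datetime([
        '1992-08-27 07:46:48', '1992-08-27 08:00:48', '1992-08-27 08:33:48',
        '1992-08-27 08:43:48', '1992-08-27 08:48:48', '1992-08-27 08:51:48',
        '1992-08-27 08:53:48', '1992-08-27 08:56:48', '1992-08-27 09:03:48',
    ]),
    name='Values',
)


def count_on_grid(series, freq):
    """Count original timestamps that coincide with the resample grid (hypothetical helper)."""
    grid = series.resample(freq).asfreq().index
    return int(series.index.isin(grid).sum())


n4 = count_on_grid(s, '4s')
n5 = count_on_grid(s, '5s')
print(f"original points on the 4s grid: {n4} of {len(s)}")
print(f"original points on the 5s grid: {n5} of {len(s)}")

# Option 1: interpolate on the union of both indexes, then keep only the new grid.
oidx = s.index
nidx = pd.date_range(oidx.min(), oidx.max(), freq='5s')
res = s.reindex(oidx.union(nidx)).interpolate('index').reindex(nidx)
print("NaNs left after Option 1:", int(res.isna().sum()))
```

Note that nidx starts at the first original timestamp, whereas the grid produced by resample('5s') starts at 07:46:45, as the question's output shows.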
Option 2A
If you are willing to forgo accuracy, you can ffill with a limit of 1:
res = s.resample('5s').ffill(limit=1).interpolate()
res.plot(style='.-')
s.plot(style='o')
Option 2B
Same thing with bfill
res = s.resample('5s').bfill(limit=1).interpolate()
res.plot(style='.-')
s.plot(style='o')
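To get a feel for how much accuracy Option 2A gives up, one possible check is to compare it with the union-index interpolation evaluated on the same 5-second grid. This is only a sketch: the series is rebuilt inline from the question's sample data, and the names grid, exact, approx and max_err are invented for the illustration.

```python
import pandas as pd

# Same timestamps and values as the sample data in the question.
s = pd.Series(
    [1, 2, 4, 3, 1, 5, 4, 2, 1],
    index=pd.to_datetime([
        '1992-08-27 07:46:48', '1992-08-27 08:00:48', '1992-08-27 08:33:48',
        '1992-08-27 08:43:48', '1992-08-27 08:48:48', '1992-08-27 08:51:48',
        '1992-08-27 08:53:48', '1992-08-27 08:56:48', '1992-08-27 09:03:48',
    ]),
    name='Values',
)

# The 5-second grid that resample itself would use.
grid = s.resample('5s').asfreq().index

# Reference: time-aware interpolation on the union index, read off at the grid.
exact = s.reindex(s.index.union(grid)).interpolate('index').reindex(grid)

# Option 2A: push each original value onto the next grid point, then interpolate.
approx = s.resample('5s').ffill(limit=1).interpolate()

# Grid points where either series is still NaN are skipped by max().
max_err = float((approx - exact).abs().max())
print("largest deviation of Option 2A from the reference:", max_err)
```

Because ffill moves every value to a later grid point, the whole curve is shifted slightly in time, and an original point that lies after the last grid point is not picked up at all.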
Option 3
Intermediate complexity and accuracy
nidx = pd.date_range(oidx.min(), oidx.max(), freq='5s')
res = s.reindex(nidx, method='nearest', limit=1).interpolate()
res.plot(style='.-')
s.plot(style='o')