I am trying to use LOWESS to smooth the following data:
I would like to obtain a smooth line that filters out the spikes in the data. My code is as follows:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import HourLocator, DayLocator, DateFormatter
from statsmodels.nonparametric.smoothers_lowess import lowess
file = r'C:...'
df = pd.read_csv(file) # reads data file
df['Date'] = pd.to_datetime(df['Time Local'], format='%d/%m/%Y %H:%M')
x = df['Date']
y1 = df['CTk2 Level']
filtered = lowess(y1, x, is_sorted=True, frac=0.025, it=0)
plt.plot(x, y1, 'r')
plt.plot(filtered[:,0], filtered[:,1], 'b')
plt.show()
When I run this code, I get the following error:
ValueError: view limit minimum -7.641460199922635e+16 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units
The dates in my data are in the format 07/05/2018 00:07:00. I think the issue is that LOWESS is struggling with the datetime data, but I'm not sure.
Can you please help me?
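lowess doesn't preserve the datetime type: it casts x to float and returns the dates as nanoseconds since the epoch, together with the smoothed values, in a single two-column array. Luckily it is easy to convert back:
smoothedx, smoothedy = lowess(y1, x, is_sorted=True, frac=0.025, it=0).T  # transpose the (n, 2) result to unpack its columns
smoothedx = smoothedx.astype('int64').astype('datetime64[ns]')  # float nanoseconds -> datetimes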
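An alternative that avoids handing datetimes to lowess at all is to smooth against elapsed seconds and rebuild the timestamps afterwards. The following is only a minimal sketch under stated assumptions: the helper name `lowess_datetime` and the synthetic series are made up for illustration and are not part of the question's data, and the `frac` value is arbitrary.

```python
import numpy as np
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess


def lowess_datetime(dates, values, frac=0.25, it=0):
    """Hypothetical helper: smooth `values` against datetime `dates`.

    Returns (datetimes, smoothed_values), both ordered by time.
    """
    dates = pd.to_datetime(pd.Series(dates)).reset_index(drop=True)
    # Numeric x axis: seconds elapsed since the first sample,
    # so lowess only ever sees plain floats.
    elapsed = (dates - dates.iloc[0]).dt.total_seconds().to_numpy()
    result = lowess(np.asarray(values, dtype=float), elapsed, frac=frac, it=it)
    # result has shape (n, 2): column 0 is the sorted x, column 1 the fit.
    smoothed_dates = dates.iloc[0] + pd.to_timedelta(result[:, 0], unit="s")
    return smoothed_dates, result[:, 1]


# Synthetic stand-in for the tank-level data: a slow trend plus spikes.
rng = np.random.default_rng(0)
dates = pd.date_range("2018-05-07 00:07", periods=200, freq="min")
level = np.sin(np.linspace(0, 3, 200)) + rng.normal(0, 0.05, 200)
level[::25] += 5.0  # artificial spikes

smoothed_dates, smoothed = lowess_datetime(dates, level, frac=0.2)
```

Because `smoothed_dates` holds real datetimes again, it can be passed straight to `plt.plot(smoothed_dates, smoothed, 'b')` on the same axes as the raw data without triggering the view-limit error.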