
LOWESS smoothing of time series data in Python

I am trying to use LOWESS to smooth the following data:

[Figure: time series data]

I would like to obtain a smooth line that filters out the spikes in the data. My code is as follows:

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import HourLocator, DayLocator, DateFormatter
from statsmodels.nonparametric.smoothers_lowess import lowess

file = r'C:...'
df = pd.read_csv(file) # reads data file   

df['Date'] = pd.to_datetime(df['Time Local'], format='%d/%m/%Y  %H:%M')     

x = df['Date']  
y1 = df['CTk2 Level'] 

filtered = lowess(y1, x, is_sorted=True, frac=0.025, it=0)

plt.plot(x, y1, 'r')
plt.plot(filtered[:,0], filtered[:,1], 'b')

plt.show()

When I run this code, I get the following error:

ValueError: view limit minimum -7.641460199922635e+16 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units

The dates in my data are in the format 07/05/2018 00:07:00. I think the issue is that LOWESS is struggling to work with the datetime data, but I'm not sure.

Can you please help me?

James asked Dec 11 '18 11:12



1 Answer

lowess does not preserve the datetime type of x. It converts the dates to floats and returns them as nanoseconds since the epoch, which Matplotlib then cannot interpret as dates. Luckily, it is easy to convert back:

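filtered = lowess(y1, x, is_sorted=True, frac=0.025, it=0)  # (n, 2) array: column 0 is x, column 1 is smoothed y
smoothedx = filtered[:, 0].astype('int64').astype('datetime64[ns]')  # nanoseconds since epoch -> datetime
smoothedy = filtered[:, 1]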
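To make the round trip concrete, here is a minimal sketch on synthetic data. The `dates` and `level` arrays are made-up stand-ins for the question's 'Date' and 'CTk2 Level' columns, not the real data. It assumes the timestamps are stored as datetime64[ns], so the numeric x values are nanoseconds since the epoch. It also shows an alternative that avoids the conversion back: with return_sorted=False, lowess returns only the smoothed y values, which can be plotted against the original dates.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Synthetic stand-in for the question's data: one reading per minute.
dates = np.datetime64("2018-05-07T00:07") + np.arange(500) * np.timedelta64(1, "m")
dates = dates.astype("datetime64[ns]")

rng = np.random.default_rng(0)
level = np.sin(np.linspace(0, 6, 500)) + rng.normal(0, 0.1, 500)
level[::50] += 3.0  # inject a spike every 50 samples

# Hand lowess plain numbers: nanoseconds since the epoch, as floats.
x_num = dates.astype("int64").astype(float)

# Option 1: lowess returns an (n, 2) array of [x, smoothed y];
# x is still numeric, so convert it back to datetime64[ns].
filtered = lowess(level, x_num, is_sorted=True, frac=0.025, it=0)
smoothed_x = filtered[:, 0].astype("int64").astype("datetime64[ns]")
smoothed_y = filtered[:, 1]

# Option 2: ask for the fitted values only and reuse the original dates.
smoothed_only = lowess(level, x_num, is_sorted=True, frac=0.025, it=0,
                       return_sorted=False)
```

Either `plt.plot(smoothed_x, smoothed_y)` or `plt.plot(dates, smoothed_only)` should then draw on a datetime axis without the view-limit error. If spikes still show through, raising `it` above 0 may help, since those extra iterations down-weight points with large residuals.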
chthonicdaemon answered Oct 22 '22 05:10
