Predicting on new data using locally weighted regression (LOESS/LOWESS)

How can I fit a locally weighted regression in Python so that it can be used to predict on new data?

There is statsmodels.nonparametric.smoothers_lowess.lowess, but it returns the estimates only for the original data set; so it seems to only do fit and predict together, rather than separately as I expected.

scikit-learn always has a fit method that allows the object to be used later on new data with predict; but it doesn't implement lowess.

max asked Mar 27 '16 20:03




1 Answer

Lowess works great for predicting (when combined with interpolation)! I think the code is pretty straightforward; let me know if you have any questions!

import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
import statsmodels.api as sm
# in a Jupyter notebook, also run the magic: %matplotlib inline

# introduce some floats in our x-values
x = list(range(3, 33)) + [3.2, 6.2]
y = [1,2,1,2,1,1,3,4,5,4,5,6,5,6,7,8,9,10,11,11,12,11,11,10,12,11,11,10,9,8,2,13]

# lowess returns our "smoothed" data as (x, y) pairs sorted by x, with a y value for every x value
lowess = sm.nonparametric.lowess(y, x, frac=.3)

# unpack the lowess smoothed points (an n-by-2 array) into their x and y columns
lowess_x = lowess[:, 0]
lowess_y = lowess[:, 1]

# run scipy's interpolation. With bounds_error=False, points outside the data range
# come back as NaN; pass fill_value="extrapolate" to extrapolate instead
f = interp1d(lowess_x, lowess_y, bounds_error=False)

xnew = [i/10. for i in range(400)]

# this generates y values for our new x values with our interpolator
# it will MISS (return NaN for) values outside of the x window (less than 3, greater than 32)
# There might be a better approach, but you can run a for loop
# and if the value is out of the range, use f(min(lowess_x)) or f(max(lowess_x))
ynew = f(xnew)


plt.plot(x, y, 'o')
plt.plot(lowess_x, lowess_y, '*')
plt.plot(xnew, ynew, '-')
plt.show()
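
One possible way to avoid the for loop for out-of-range values is numpy's np.interp, which returns the first or last y value for points outside the x window. This is only a sketch: make_predictor is a hypothetical helper name, not part of statsmodels or scipy, and the smoothed array below is hand-written stand-in data rather than real lowess output. It assumes the smoothed points are an n-by-2 array sorted by x, which is what lowess returns by default.

```python
import numpy as np

def make_predictor(smoothed):
    """Return a function that predicts y at new x values.

    `smoothed` is assumed to be an (n, 2) array of (x, y) pairs sorted by x,
    e.g. the default output of sm.nonparametric.lowess.
    """
    smoothed = np.asarray(smoothed, dtype=float)
    xs, ys = smoothed[:, 0], smoothed[:, 1]

    def predict(x_new):
        # np.interp interpolates linearly inside [xs[0], xs[-1]] and
        # returns ys[0] or ys[-1] for points outside that range.
        return np.interp(x_new, xs, ys)

    return predict

# stand-in for lowess output, hand-written for illustration only
smoothed = np.array([[3.0, 1.0], [4.0, 2.0], [6.0, 3.0]])
predict = make_predictor(smoothed)

# 0.0 and 40.0 are outside the x window, so they are clamped to the end values
print(predict([0.0, 3.5, 5.0, 40.0]))
```

With the code above you would call make_predictor(lowess) once and then reuse the returned function on any new data, which is close to the fit/predict split asked about. Clamping is a design choice; whether a flat line beyond the data is sensible depends on your problem.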
Daniel Hitchcock answered Sep 21 '22 07:09