Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a Python equivalent to the smooth.spline function in R

The smooth.spline function in R allows a tradeoff between roughness (as defined by the integrated square of the second derivative) and fitting the points (as defined by summing the squares of the residuals). This tradeoff is accomplished by the spar or df parameter. At one extreme you get the least squares line, and the other you get a very wiggly curve which intersects all of the data points (or the mean if you have duplicated x values with different y values)

I have looked at scipy.interpolate.UnivariateSpline and other spline variants in Python, however, they seem to only tradeoff by increasing the number of knots, and setting a threshold (called s) for the allowed SS residuals. By contrast, the smooth.spline in R allows having knots at all the x values, without necessarily having a wiggly curve that hits all the points -- the penalty comes from the second derivative.

Does Python have a spline fitting mechanism that behaves in this way? Allowing all knots but penalizing the second derivative?

like image 535
blucena Avatar asked Mar 27 '15 23:03

blucena


People also ask

What is spline in Python?

WHAT IS A SPLINE? A Spline is essentially a piecewise regression line. Trying to fit one regression line over a very dynamic set of data can let to a lot of compromise. You can tailor your line to fit one area well, but then can often suffer from overfitting in other areas as a consequence.

What is smooth spline in R?

Smoothing splines are a powerful approach for estimating functional relationships between a predictor X and a response Y. Smoothing splines can be fit using either the smooth. spline function (in the stats package) or the ss function (in the npreg package).

What are splines in R?

Splines are a smooth and flexible way of fitting Non linear Models and learning the Non linear interactions from the data.In most of the methods in which we fit Non linear Models to data and learn Non linearities is by transforming the data or the variables by applying a Non linear transformation.

What is Cubic spline in R?

Cubic regression spline is a form of generalized linear models in regression analysis. Also known as B-spline, it is supported by a series of interior basis functions on the interval with chosen knots. Cubic regression splines are widely used on modeling nonlinear data and interaction between variables.


1 Answers

You can use R functions in Python with rpy2:

import rpy2.robjects as robjects
r_y = robjects.FloatVector(y_train)
r_x = robjects.FloatVector(x_train)

r_smooth_spline = robjects.r['smooth.spline'] #extract R function# run smoothing function
spline1 = r_smooth_spline(x=r_x, y=r_y, spar=0.7)
ySpline=np.array(robjects.r['predict'](spline1,robjects.FloatVector(x_smooth)).rx2('y'))
plt.plot(x_smooth,ySpline)

If you want to directly set lambda: spline1 = r_smooth_spline(x=r_x, y=r_y, lambda=42) doesn't work, because lambda has already another meaning in Python, but there is a solution: How to use the lambda argument of smooth.spline in RPy WITHOUT Python interprating it as lambda.

To get the code running you first need to define the data x_train and y_train and you can define x_smooth=np.array(np.linspace(-3,5,1920)). if you want to plot it between -3 and 5 in Full-HD-resolution.

like image 55
Jakob Avatar answered Oct 03 '22 01:10

Jakob