Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

B Spline confusion

I realise that there are posts on the topic of B-Splines on this board but those have actually made me more confused so I thought someone might be able to help me.

I have simulated data for x-values ranging from 0 to 1. I'd like to fit to my data a cubic spline (degree = 3) with knots at 0, 0.1, 0.2, ... , 0.9, 1. I'd also like to use the B-Spline basis and OLS for parameter estimation (I'm not looking for penalised splines).

I think I need the bs function from the spline package but I'm not quite sure and I also don't know what exactly to feed it.

I'd also like to plot the resulting polynomial spline.

Thanks!

like image 313
user2249626 Avatar asked Apr 05 '13 15:04

user2249626


People also ask

What is B-spline curve explain it?

A B-spline curve is defined as a linear combination of control points and B-spline basis functions given by. (1.62) In this context the control points are called de Boor points. The basis function is defined on a knot vector.

Are B-splines differentiable?

A B-spline function is the maximally differentiable interpolative basis function.

What are main characteristics of the B-spline curve?

Properties of B-spline Curve :Each basis function has 0 or +ve value for all parameters. Each basis function has one maximum value except for k=1. The degree of B-spline curve polynomial does not depend on the number of control points which makes it more reliable to use than Bezier curve.

What can you do to control the shape of B-spline?

The simplest, and usually fastest, way to manipulate B-Spline curves is to move B-Spline points closer to or farther from one another. When B-Spline points are moved closer to one another, a sharper curve is created. B-Spline points moved farther from one another create shallower curves.


1 Answers

## simulate some data - from mgcv::magic
set.seed(1)
n <- 400
x <- 0:(n-1)/(n-1)
f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
y <- f + rnorm(n, 0, sd = 2)

## load the splines package - comes with R
require(splines)

You use the bs() function in a formula to lm as you want OLS estimates. bs provides the basis functions as given by the knots, degree of polynomial etc.

mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

You can treat that just like a linear model.

> anova(mod)
Analysis of Variance Table

Response: y
                                        Df Sum Sq Mean Sq F value    Pr(>F)    
bs(x, knots = seq(0.1, 0.9, by = 0.1))  12 2997.5 249.792  65.477 < 2.2e-16 ***
Residuals                              387 1476.4   3.815                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Some pointers on knot placement. bs has an argument Boundary.knots, with default Boundary.knots = range(x) - hence when I specified the knots argument above, I did not include the boundary knots.

Read ?bs for more information.

Producing a plot of the fitted spline

In the comments I discuss how to draw the fitted spline. One option is to order the data in terms of the covariate. This works fine for a single covariate, but need not work for 2 or more covariates. A further issue is that you can only evaluate the fitted spline at the observed values of x - this is fine if you have densely sampled the covariate, but if not, the spline may look odd, with long linear sections.

A more general solution is to use predict to generate predictions from the model for new values of the covariate or covariates. In the code below I show how to do this for the model above, predicting for 100 evenly-spaced values over the range of x.

pdat <- data.frame(x = seq(min(x), max(x), length = 100))
## predict for new `x`
pdat <- transform(pdat, yhat = predict(mod, newdata = pdat))

## now plot
ylim <- range(pdat$y, y) ## not needed, but may be if plotting CIs too
plot(y ~ x)
lines(yhat ~ x, data = pdat, lwd = 2, col = "red")

That produces

enter image description here

like image 103
Gavin Simpson Avatar answered Sep 28 '22 07:09

Gavin Simpson