How to interpret lm() coefficient estimates when using bs() function for splines

Tags: r, regression, spline, lm, bspline

I'm using a set of points which go from (-5,5) to (0,0) and (5,5) in a "symmetric V-shape". I'm fitting a model with lm() and the bs() function to fit a "V-shape" spline:

lm(formula = y ~ bs(x, degree = 1, knots = c(0)))

I get the "V-shape" when I predict outcomes by predict() and draw the prediction line. But when I look at the model estimates coef(), I see estimates that I don't expect.

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805

I would expect a -1 coefficient for the first part and a +1 coefficient for the second part. Must I interpret the estimates in a different way?

If I add the knot to the lm() call manually, then I get these coefficients:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.18258    0.13558  -1.347    0.215    
x           -1.02416    0.04805 -21.313 2.47e-08 ***
z            2.03723    0.08575  23.759 1.05e-08 ***

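That's more like it. The coefficient of z (the term that starts at the knot) is a change relative to x, so the slope to the right of the knot is about -1.02 + 2.04 ≈ +1.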

I want to understand how to interpret the bs() result. I've checked: the predicted values of the manual model and the bs() model are exactly the same.
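For reference, a minimal sketch of what such a manual fit could look like. It assumes that z is the hinge term pmax(x, 0) and it uses simulated data, since neither the original data nor the manual model formula is shown above. Under that assumption the two parametrisations describe the same piecewise-linear function, and the first bs() coefficient is the left-hand slope times the distance from the left boundary to the knot (5 here).

```r
library(splines)

# Simulated V-shaped data (an assumption; the original data are not shown)
set.seed(1)
x <- seq(-5, 5, length.out = 11)
y <- abs(x) + rnorm(length(x), sd = 0.3)

# Assumed manual parametrisation: a hinge term that switches on at the knot
z <- pmax(x, 0)
fit_manual <- lm(y ~ x + z)
fit_bs     <- lm(y ~ bs(x, degree = 1, knots = 0))

# Both models span the same set of piecewise-linear functions
same_fit <- isTRUE(all.equal(unname(fitted(fit_manual)),
                             unname(fitted(fit_bs))))
print(same_fit)

# Slopes read off the manual fit
slope_left  <- unname(coef(fit_manual)["x"])
slope_right <- unname(coef(fit_manual)["x"] + coef(fit_manual)["z"])

# First bs() coefficient = left slope * (knot - left boundary)
bs_coef_1 <- unname(coef(fit_bs)[2])
print(c(5 * slope_left, bs_coef_1))
```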

asked May 21 '16 12:05

PDG

2 Answers

"I would expect a -1 coefficient for the first part and a +1 coefficient for the second part."

I think your question is really about what a B-spline function is. If you want to understand the meaning of the coefficients, you need to know what the basis functions of your spline are. See the following:

library(splines)
x <- seq(-5, 5, length = 100)
b <- bs(x, degree = 1, knots = 0)  ## returns a basis matrix
str(b)  ## check structure
b1 <- b[, 1]  ## basis 1
b2 <- b[, 2]  ## basis 2
par(mfrow = c(1, 2))
plot(x, b1, type = "l", main = "basis 1: b1")
plot(x, b2, type = "l", main = "basis 2: b2")

[Figure: the two basis functions, "basis 1: b1" and "basis 2: b2", plotted against x]

Note:

1. B-splines of degree 1 are tent functions, as you can see from b1;
2. B-splines of degree 1 are scaled so that their values lie between 0 and 1;
3. a knot of a B-spline of degree 1 is where it bends;
4. B-splines of degree 1 have compact support: each is non-zero only over (no more than) three adjacent knots.
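These properties can be checked numerically. A small sketch, using the same kind of grid as above (the variable names value_range, peak_x and left_zero are only for illustration):

```r
library(splines)

x <- seq(-5, 5, length.out = 101)
b <- bs(x, degree = 1, knots = 0)

# Basis values are scaled to lie between 0 and 1
value_range <- range(b)
print(value_range)

# b1 is a tent that peaks at the interior knot
peak_x <- x[which.max(b[, 1])]
print(peak_x)

# b2 is zero everywhere to the left of the knot (compact support)
left_zero <- all(abs(b[x <= 0, 2]) < 1e-12)
print(left_zero)

# Boundary knots default to the range of x
print(attr(b, "Boundary.knots"))
```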

You can get the (recursive) expression of B-splines from Definition of B-spline. The B-spline of degree 0 is the most basic class, while

a B-spline of degree 1 is a linear combination of B-splines of degree 0
a B-spline of degree 2 is a linear combination of B-splines of degree 1
a B-spline of degree 3 is a linear combination of B-splines of degree 2
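To make the recursion concrete, here is a rough sketch of the Cox-de Boor recursion in R. The helper bspline() is hypothetical, written only for this illustration; it is not how bs() is implemented internally. It assumes the full knot vector for degree 1 has each boundary knot repeated twice, and that bs() without an intercept drops the first basis function, so its columns correspond to basis functions 2 and 3.

```r
library(splines)

# Cox-de Boor recursion: basis function i of a given degree on a full knot vector.
# Hypothetical helper for illustration only.
bspline <- function(x, i, degree, knots) {
  if (degree == 0) {
    # Degree 0: indicator of the half-open interval [knots[i], knots[i + 1])
    return(as.numeric(knots[i] <= x & x < knots[i + 1]))
  }
  d1 <- knots[i + degree] - knots[i]
  d2 <- knots[i + degree + 1] - knots[i + 1]
  # Terms with a zero-width knot span are dropped by convention
  left <- if (d1 > 0) {
    (x - knots[i]) / d1 * bspline(x, i, degree - 1, knots)
  } else 0
  right <- if (d2 > 0) {
    (knots[i + degree + 1] - x) / d2 * bspline(x, i + 1, degree - 1, knots)
  } else 0
  left + right
}

# Full knot vector for degree 1: boundary knots repeated twice, interior knot at 0
kn <- c(-5, -5, 0, 5, 5)
# Stop short of the right boundary, because the intervals are half-open
x <- seq(-5, 4.9, by = 0.1)

b <- bs(x, degree = 1, knots = 0, Boundary.knots = c(-5, 5))
max_diff <- max(abs(b[, 1] - bspline(x, 2, 1, kn)),
                abs(b[, 2] - bspline(x, 3, 1, kn)))
print(max_diff)
```

The grid stops at 4.9 because the degree-0 indicator is half-open and evaluates to zero exactly at the right boundary, a case that bs() handles specially.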

(Sorry, I was getting off-topic...)

Your linear regression using B-splines:

y ~ bs(x, degree = 1, knots = 0)

is just doing:

y ~ b1 + b2
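A quick way to check this equivalence, as a sketch on simulated data (the question's data are not available, so the numbers will differ from the summary table):

```r
library(splines)

# Simulated noisy V-shape (an assumption, not the question's data)
set.seed(42)
x <- seq(-5, 5, length.out = 50)
y <- abs(x) + rnorm(50, sd = 0.2)

b  <- bs(x, degree = 1, knots = 0)
b1 <- b[, 1]
b2 <- b[, 2]

fit_formula <- lm(y ~ bs(x, degree = 1, knots = 0))
fit_columns <- lm(y ~ b1 + b2)

# Same design matrix, hence the same coefficients
coef_match <- isTRUE(all.equal(unname(coef(fit_formula)),
                               unname(coef(fit_columns))))
print(coef_match)

# Fitted values are intercept + c1 * b1 + c2 * b2
cf <- unname(coef(fit_columns))
manual_fitted <- cf[1] + cf[2] * b1 + cf[3] * b2
fitted_match <- isTRUE(all.equal(manual_fitted,
                                 unname(fitted(fit_columns))))
print(fitted_match)
```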

Now you should be able to see what the coefficients mean. Apart from the intercept, the fitted spline function is:

-5.12079 * b1 - 0.05545 * b2

These are the estimates from the summary table:

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805

You might wonder why the coefficient of b2 is not significant. Compare your y with b1: your y is a symmetric V-shape, while b1 is an inverted symmetric V-shape (a tent). If you multiply b1 by -1, rescale it by a factor of 5 (this explains the coefficient of about -5 for b1) and shift it up by the intercept of about 5, what do you get? A good match, right? So there is no need for b2.
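The noise-free case makes this explicit. In the sketch below y is exactly abs(x) (an idealised stand-in for your data), so the fit should return an intercept of about 5, a coefficient of about -5 for b1 and a coefficient of essentially zero for b2:

```r
library(splines)

# Idealised symmetric V with no noise (an assumption for illustration)
x <- seq(-5, 5, length.out = 101)
y <- abs(x)

fit <- lm(y ~ bs(x, degree = 1, knots = 0))
cf  <- unname(coef(fit))
print(round(cf, 6))

# Rebuild the V by hand: 5 - 5 * b1 + 0 * b2
b <- bs(x, degree = 1, knots = 0)
rebuilt <- 5 - 5 * b[, 1]
max_err <- max(abs(rebuilt - y))
print(max_err)
```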

However, if your y is asymmetric, running from (-5,5) through (0,0) and then to (5,10), you will notice that the coefficients of b1 and b2 are both significant. I think the other answer already gives such an example.

Reparametrization of a fitted B-spline as piecewise polynomials is demonstrated in the post "Reparametrize fitted regression spline as piece-wise polynomials and export polynomial coefficients".

answered Sep 28 '22 11:09

Zheyuan Li

Here is a simple example of a first-degree spline with a single knot, showing how to turn the estimated coefficients into the slopes of the fitted lines:

library(splines)
set.seed(313)
x<-seq(-5,+5,len=1000)
y<-c(seq(5,0,len=500)+rnorm(500,0,0.25),
     seq(0,10,len=500)+rnorm(500,0,0.25))
plot(x,y, xlim = c(-6,+6), ylim = c(0,+8))
fit <- lm(formula = y ~ bs(x, degree = 1, knots = c(0)))
x.predict <- seq(-2.5,+2.5,len = 100)
lines(x.predict, predict(fit, data.frame(x = x.predict)), col =2, lwd = 2)

This produces the following plot:

[Figure: scatter plot of the simulated data with the fitted spline line overlaid]

Since we are fitting a spline with degree=1 (i.e. piecewise straight lines) and a knot at x=0, we get two lines, one for x<=0 and one for x>0.

The coefficients are

> round(summary(fit)$coefficients,3)
                                 Estimate Std. Error  t value Pr(>|t|)
(Intercept)                         5.014      0.021  241.961        0
bs(x, degree = 1, knots = c(0))1   -5.041      0.030 -166.156        0
bs(x, degree = 1, knots = c(0))2    4.964      0.027  182.915        0

These can be translated into the slope of each straight line using the knot (which we specified at x=0) and the boundary knots (the min and max of the explanatory data):

# two boundary knots and one specified
knot.boundary.left <- min(x)
knot <- 0
knot.boundary.right <- max(x)

slope.1 <- summary(fit)$coefficients[2,1] /(knot - knot.boundary.left)
slope.2 <- (summary(fit)$coefficients[3,1] - summary(fit)$coefficients[2,1]) / (knot.boundary.right - knot)
slope.1
slope.2
> slope.1
[1] -1.008238
> slope.2
[1] 2.000988
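One way to see why these formulas work, sketched on the same kind of simulated data: for a degree-1 spline, the intercept is the fitted height at the left boundary knot, and each bs() coefficient is the fitted height at the next knot minus that starting height. The slopes are then just rise over run between adjacent knots. The names at_knots, height_diffs and slopes are only for illustration.

```r
library(splines)

set.seed(313)
x <- seq(-5, 5, length.out = 1000)
y <- c(seq(5, 0, length.out = 500), seq(0, 10, length.out = 500)) +
  rnorm(1000, 0, 0.25)
fit <- lm(y ~ bs(x, degree = 1, knots = 0))

knots_all <- c(-5, 0, 5)  # left boundary, interior knot, right boundary
at_knots  <- unname(predict(fit, data.frame(x = knots_all)))
cf        <- unname(coef(fit))

# Coefficients 2 and 3 are the fitted heights at x = 0 and x = 5,
# measured relative to the fitted height at x = -5 (the intercept)
height_diffs <- at_knots[-1] - at_knots[1]
print(rbind(height_diffs, coefficients = cf[2:3]))

# Slopes as rise over run between adjacent knots
slopes <- diff(at_knots) / diff(knots_all)
print(slopes)
```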

answered Sep 28 '22 10:09

rbm
