Most people are probably familiar with bs
from splines:
library(splines)
workingModel <- lm(mpg ~ factor(gear) + bs(wt, knots = 5) + hp, data = mtcars)
bs(mtcars$wt, knots = 4)
This uses a b-spline for the singe variable weight, but you can also do multivariate splines:
bs(cbind(mtcars$wt,mtcars$hp), knots = 4)
But this produces a matrix with twice as many rows as mtcars
, so when I try:
brokenModel <- lm(mpg ~ bs(cbind(mtcars$wt,mtcars$hp), knots = 4), data = mtcars)
I get an error about differing lengths.
My question is: how do I use the multivariate spline in a model if it has a different number of rows than my outcome variable? Do I stack my outcome variable on top of itself y <- c(y, y)
? Why does the multivariate spline produce extra rows?
Thanks.
You can't use splines::bs
in this case, as it is strictly for construction of a univariate spline. If you do bs(mat)
where mat
is a matrix, it is just doing bs(c(mat))
. For example,
mat <- matrix(runif(8), 4, 2)
identical(bs(mat), bs(c(mat)))
# [1] TRUE
This explains why you get double number of rows, when doing bs(cbind(mtcars$wt,mtcars$hp)
.
To create a 2D spline, the simplest way is to create additive spline:
lm(mpg ~ factor(gear) + bs(wt, knots = 5) + bs(hp, knots = 4), mtcars)
but this may not be what you want. Then consider interaction:
model <- lm(mpg ~ factor(gear) + bs(wt, knots = 5):bs(hp, knots = 4), mtcars)
The bs(wt, knots = 5):bs(hp, knots = 4)
is forming row-wise Kronecker product between the two design matrices. Since bs(wt, knots = 5)
is a matrix of 4 columns, and bs(hp, knots = 4)
is a matrix of 3 columns, the interaction has 4 * 3 = 12 columns.
Alternatively, consider using mgcv
package. In mgcv
, multivariate splines can be constructed in two ways:
Clearly you want the second here, as wt
and hp
have different units. To construct tensor product splines, we can use:
library(mgcv)
fit <- gam(mpg ~ factor(gear)
+ s(wt, bs = 'cr', k = 4, fx = TRUE)
+ s(hp, bs = 'cr', k = 4, fx = TRUE)
+ ti(wt, hp, bs = 'cr', k = c(4, 4), d = c(1, 1), fx = TRUE),
data = mtcars)
Here, I specially set fx = TRUE
to disable penalized regression.
I don't want to write an extensive answer to introduce mgcv
. For how s
, ti
and gam
work, just read documentation. If you need to bridge the gap in theory, grab Simon Wood's book published in 2006: Generalized Additive Models: an introduction with R.
A practical example of
mgcv
usage?
I had an answer Cubic spline method for longitudinal series data which might help you get familiar with mgcv
. But as an introductory example, it only shows how to work with univariate spline. Fortunately, that is also the key. Tensor product spline is constructed from univariate spline.
My other answers related to mgcv
is more of theoretical aspect; while not all my answers related to spline
is making reference to mgcv
. So that question & answer is the best I could give you at this stage.
Would the scale invariant tensor product splines be equivalent to radial smoothing or would that be the isotropic thin-place splines?
Radial smoothing is equivalent to thin-plate spline, as the basis function for a thin-plate spline is radial. That is why it is isotropic and can be used in spatial regression.
Tensor product spline is scale invariant, as it is constructed as (pairwise) multiplication of univariate spline basis.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With