I would like to know how to retrieve or reproduce the design matrix used under the hood by smooth.spline. It does not appear that smooth.spline has a way to return this matrix. As such, I am trying to reproduce the matrix by using the bs function.
When I run smooth.spline and specify the knots via the all.knots argument, the number of parameters from the fit is off by one compared to the bspline matrix I get by passing the same knots to the bs function. Can someone help me understand how to reproduce the training matrix that's being used under the hood by smooth.spline given a known set of knots? Thanks!
Here is code that reproduces the behavior:
n = 200
y = rnorm(n=n)
x = runif(n=n)
knots = c(0,seq(0.001,0.999,.01),1)
mod = smooth.spline(x,y,all.knots=knots,keep.stuff=TRUE)
X = bs(x,knots=knots)
dim(X)[2] #105
length(mod$fit$coef) #104
This doesn't explain how to use the internal data of the output of smooth.spline to get the smoother matrix but we can get that matrix by pumping an identity matrix through smooth.spline (which is also a general method of finding the matrix of a linear operator). Also see https://www.stat.cmu.edu/~cshalizi/dst/18/hw/02/smoother.matrix.R
set.seed(123)
n <- 200
y <- rnorm(n)
x <- runif(n)
mod <- smooth.spline(x, y, keep.stuff = TRUE)
S <- apply(diag(n), 2, function(y) fitted(smooth.spline(x, y, df = mod$df)))
max(abs(S %*% y - fitted(mod)))
## [1] 2.183287e-08
If you only need the diagonal of the smoother matrix then hatvalues(mod) will give that.
max(abs(diag(S) - hatvalues(mod)))
## [1] 1.676781e-08
This computes a matrix X which satisfies Xb = Sy.
b <- mod$fit$coef
X <- tcrossprod(fitted(mod), b/c(crossprod(b)))
max(abs(X %*% b - S %*% y))
## [1] 4.959864e-09
The equation is satisfied because if X = (Sy)b'/(b'b) then Xb = ((Sy)b'/(b'b))b = (Sy)(b'b)/(b'b) = Sy .
Note that such an X is not unique since if we add vectors orthogonal to b to rows of X then the equation is still satisfied.
Although the X we showed does satisfy the equation asked for I am not sure if it really is what you want. Note that looking at the source we can convert mod$auxM to a list of matrices like this:
aux2Mat <- function(auxM) {
stopifnot(is.list(auxM),
identical(vapply(auxM, class, ""),
setNames(rep("numeric", 4), c("XWy", "XWX", "Sigma", "R"))))
## requireNamespace("Matrix")# want sparse matrices
nk <- length(XWy <- auxM[["XWy"]])
list(XWy = XWy,
XWX = Matrix::bandSparse(nk, k= 0:3, diagonals= matrix(auxM[[ "XWX" ]], nk,4), symmetric=TRUE),
Sigma= Matrix::bandSparse(nk, k= 0:3, diagonals= matrix(auxM[["Sigma"]], nk,4), symmetric=TRUE))
}
lM <- aux2Mat(mod$auxM)
A <- lM$XWX + mod$lambda * lM$Sigma
R <- Matrix::chol(A)
The following two lines give the same result:
bb <- solve(A, lM$XWy)
b <- mod$fit$coef
max(abs(b - bb))
## [1] 1.125863e-10
Also these two are the same
crossprod(b, as.matrix(lM$XWX)) %*% b
## [,1]
## [1,] 0.01597881
crossprod(X %*% b)
## [,1]
## [1,] 0.01597881
and these two are the same
y %*% X %*% b
## [,1]
## [1,] 0.02039534
lM$XWy %*% b
## [,1]
## [1,] 0.02039534
Corrections.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With