Confidence intervals for Ridge regression

I can't compute confidence intervals for a ridge regression. I have this model:

model5 <- glmnet(train_x, train_y, family = "gaussian", alpha = 0, lambda = 0.01)

And when I make predictions I use this command:

test_pred <- predict(model5, test_x, type = "link")

Does anyone know how to compute confidence intervals for these predictions?

Ana Laura Carreiras asked Sep 28 '16

People also ask

What is the formula for ridge regression?

In ridge regression, however, the formula for the hat matrix must include the regularization penalty: H_ridge = X(X′X + λI)^(−1)X′, which gives df_ridge = tr(H_ridge), no longer equal to the number of predictors m. Some ridge regression software nevertheless produces information criteria based on the OLS formula.
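The formula above can be checked numerically in base R. This is an illustrative sketch with simulated data (the matrix X, the penalty lambda, and the dimensions are made up, not taken from the question):

```r
set.seed(1)
n <- 50; m <- 5
X <- matrix(rnorm(n * m), n, m)   # simulated design matrix
lambda <- 2                       # arbitrary ridge penalty

# Ridge hat matrix: H_ridge = X (X'X + lambda I)^{-1} X'
H_ridge <- X %*% solve(crossprod(X) + lambda * diag(m)) %*% t(X)

# Effective degrees of freedom: the trace of H_ridge
df_ridge <- sum(diag(H_ridge))

# With lambda > 0, df_ridge is strictly less than m;
# at lambda = 0 it would equal m, the OLS value.
df_ridge
```

Shrinking each eigen-direction by d_i^2 / (d_i^2 + lambda) is what pulls the trace below m, which is why the OLS information-criterion formula overstates the model's complexity.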

What are the assumptions of ridge regression?

The assumptions of ridge regression are the same as those of linear regression: linearity, constant variance, and independence. However, because ridge regression does not provide confidence limits, the errors need not be assumed normally distributed.

Can ridge regression be used for regression?

Ridge regression is a method for handling multicollinearity in multiple regression data. It is most suitable when a data set contains more predictor variables than observations.

Is Ridge better than OLS?

Studies by [17], who applied ridge regression to the unemployment rate in Iraq, recommended ridge regression over OLS because it provides better estimates when the independent variables are correlated, without omitting any of them.


1 Answer

It turns out that glmnet deliberately does not provide standard errors (and therefore does not give you confidence intervals), as explained here and also addressed in this vignette (excerpt below):

It is a very natural question to ask for standard errors of regression coefficients or other estimated quantities. In principle such standard errors can easily be calculated, e.g. using the bootstrap.

Still, this package deliberately does not provide them. The reason for this is that standard errors are not very meaningful for strongly biased estimates such as arise from penalized estimation methods. Penalized estimation is a procedure that reduces the variance of estimators by introducing substantial bias. The bias of each estimator is therefore a major component of its mean squared error, whereas its variance may contribute only a small part.

Unfortunately, in most applications of penalized regression it is impossible to obtain a sufficiently precise estimate of the bias. Any bootstrap-based calculations can only give an assessment of the variance of the estimates. Reliable estimates of the bias are only available if reliable unbiased estimates are available, which is typically not the case in situations in which penalized estimates are used.

Reporting a standard error of a penalized estimate therefore tells only part of the story. It can give a mistaken impression of great precision, completely ignoring the inaccuracy caused by the bias. It is certainly a mistake to make confidence statements that are only based on an assessment of the variance of the estimates, such as bootstrap-based confidence intervals do.

Reliable confidence intervals around the penalized estimates can be obtained in the case of low dimensional models using the standard generalized linear model theory as implemented in lm, glm and coxph. Methods for constructing reliable confidence intervals in the high-dimensional situation are, to my knowledge, not available.
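For the low-dimensional, unpenalized case the excerpt points to, confidence intervals come directly from lm via predict(..., interval = "confidence"). A quick sketch with simulated data (not the asker's):

```r
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)          # simulated linear relationship
fit <- lm(y ~ x)

# Pointwise 95% confidence intervals for the mean response
new_data <- data.frame(x = c(-1, 0, 1))
ci <- predict(fit, new_data, interval = "confidence", level = 0.95)
ci  # columns: fit, lwr, upr
```

Use interval = "prediction" instead if you want intervals for new observations rather than for the mean response.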

However, if you insist on confidence intervals, check out this post.
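If you do go the bootstrap route the vignette cautions about, a minimal sketch looks like this. To keep it self-contained it uses a closed-form ridge fit in base R as a stand-in for glmnet with alpha = 0 (with glmnet you would refit model5 on each resample instead); train_x, train_y, and test_x are simulated placeholders for the asker's data. Remember the caveat above: this band reflects only the variance of the predictions, not their bias.

```r
set.seed(1)
n <- 80; p <- 4; lambda <- 0.01
train_x <- matrix(rnorm(n * p), n, p)
train_y <- drop(train_x %*% c(1, -1, 0.5, 0) + rnorm(n))
test_x  <- matrix(rnorm(5 * p), 5, p)

# Closed-form ridge predictions (no intercept; stand-in for glmnet, alpha = 0)
ridge_pred <- function(xtr, ytr, xte, lambda) {
  beta <- solve(crossprod(xtr) + lambda * diag(ncol(xtr)),
                crossprod(xtr, ytr))
  drop(xte %*% beta)
}

# Bootstrap: refit on resampled rows, collect test-set predictions
B <- 200
boot_pred <- replicate(B, {
  idx <- sample(n, replace = TRUE)
  ridge_pred(train_x[idx, ], train_y[idx], test_x, lambda)
})

# Percentile band for each test prediction (variance only -- see caveat)
ci <- apply(boot_pred, 1, quantile, probs = c(0.025, 0.975))
t(ci)  # one row per test observation: 2.5% and 97.5% limits
```

The percentile limits here are centered on the shrunken (biased) predictions, which is exactly why the vignette warns they can suggest more accuracy than you actually have.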

ilanman answered Oct 06 '22