Interpreting Lasso regression p-values versus coefficients

Tags:

r

I'm wondering how I should interpret a lasso regression's output. Take for example:

library(lasso2)
lm.lasso <- l1ce(mpg ~ . , data=mtcars)
summary(lm.lasso)$coefficients

The output is:

                  Value  Std. Error     Z score   Pr(>|Z|)
(Intercept) 36.01809203 18.92587647  1.90311355 0.05702573
cyl         -0.86225790  1.12177221 -0.76865686 0.44209704
disp         0.00000000  0.01912781  0.00000000 1.00000000
hp          -0.01399880  0.02384398 -0.58709992 0.55713660
drat         0.05501092  1.78394922  0.03083659 0.97539986
wt          -2.68868427  2.05683876 -1.30719254 0.19114733
qsec         0.00000000  0.75361628  0.00000000 1.00000000
vs           0.00000000  2.31605743  0.00000000 1.00000000
am           0.44530641  2.14959278  0.20715850 0.83588608
gear         0.00000000  1.62955841  0.00000000 1.00000000
carb        -0.09506985  0.91237207 -0.10420075 0.91701004

If I understand right, a lasso regression is supposed to shrink the coefficients of features that aren't important to the model, so that those coefficients end up essentially zero. That makes sense for the qsec, vs, and gear features. However, none of the p-values is anywhere near significant.

If I have a coefficient that's basically zero, but the p-value is close to 1, which value should I trust? Should I discard the feature from the model since its coefficient is zero, or discard it from the model since its p-value is insignificant?

AI52487963 asked Sep 05 '25 16:09


1 Answer

The null hypothesis is that the variable's coefficient is equal to zero, i.e. that the variable has no effect on the model. To reject the null hypothesis you need a p-value below your significance level (conventionally .05); the smaller the p-value, the greater your confidence in REJECTING the null hypothesis.

So when evaluating a p-value, a value of 1.00 means that there is NO CONFIDENCE in rejecting the null hypothesis (that the coefficient is zero and the variable has no influence).
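
As a rough check on how the columns of that table relate, the sketch below recomputes the Z score and p-value for three of the rows in the question. It assumes the p-value is a two-sided normal tail probability of Value / Std. Error; that is inferred from the printed numbers, not taken from the lasso2 documentation, and the names est, se, z and p are only illustrative.

```r
# Values and standard errors copied from three rows of the question's table
est <- c(cyl = -0.86225790, disp = 0.00000000, wt = -2.68868427)
se  <- c(cyl =  1.12177221, disp = 0.01912781, wt =  2.05683876)

# Assumption: Z score = Value / Std. Error
z <- est / se

# Assumption: Pr(>|Z|) is the two-sided tail area under a standard normal
p <- 2 * pnorm(-abs(z))

# A coefficient of exactly zero gives z = 0 and therefore p = 1
round(cbind(z, p), 4)
```

Under those assumptions, a coefficient that the lasso has set to exactly zero will always show a p-value of 1, whatever its standard error.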

So in your model, where the lasso dropped a coefficient to zero, the p-value of 1 supports your understanding of how the lasso reduces the coefficients of non-influential variables to zero. You should trust both the zero and the one!
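
To see why the lasso can produce coefficients that are exactly zero rather than merely small, here is a minimal sketch of soft-thresholding. This is the lasso solution only in the idealized case of orthonormal predictors with a penalty lambda on the absolute coefficients. The function name soft_threshold, the starting coefficients and the value of lambda are made up for illustration, and this is not claimed to be how l1ce computes its fit.

```r
# Soft-thresholding: pull every coefficient toward zero by lambda,
# and set it to exactly zero if it would otherwise cross zero.
soft_threshold <- function(b, lambda) {
  sign(b) * pmax(abs(b) - lambda, 0)
}

# Hypothetical unpenalized coefficients (not from mtcars)
b_ols  <- c(3.0, -1.2, 0.4, -0.1)
lambda <- 0.5

shrunk <- soft_threshold(b_ols, lambda)

# The two large coefficients are shrunk by 0.5;
# the two with |b| <= lambda become exactly 0.
print(shrunk)
```

The small coefficients are removed outright while the large ones survive in shrunken form, which is the pattern in your output for disp, qsec, vs and gear.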

sconfluentus answered Sep 07 '25 09:09
