When estimating a lasso model via the glmnet package, I am wondering whether it is better to: (a) pull coefficients / predictions / deviance straight from the cvfit object returned by cv.glmnet, or (b) use the minimum lambda from cv.glmnet to re-run glmnet and pull these objects from the glmnet fit. (Please be patient -- I have a feeling this is documented, but I'm seeing examples/tutorials of both online, and no solid logic for going one way or the other.)
That is, for coefficients, I can run (a):
cvfit = cv.glmnet(x=xtrain, y=ytrain, alpha=1, type.measure = "mse", nfolds = 20)
coef.cv <- coef(cvfit, s = "lambda.min")
Or, afterwards, I can run (b):
fit = glmnet(x=xtrain, y=ytrain, alpha=1, lambda=cvfit$lambda.min)
coef <- coef(fit, s = "lambda.min")
While these two processes select the same model variables, they do not produce identical coefficients. Similarly, I could predict via either of the following two processes:
prdct <- predict(fit,newx=xtest)
prdct.cv <- predict(cvfit, newx=xtest, s = "lambda.min")
And they produce similar but NOT identical prediction vectors.
Last, I would have THOUGHT I could pull % deviance explained via either of the two methods:
percdev <- fit$dev.ratio
percdev.cv <- cvfit$glmnet.fit$dev.ratio[cvfit$cvm==mse.min.cereal]
But in fact, it is not possible to pull percdev.cv in this way, because if the lambda sequence used by cv.glmnet has fewer than 100 elements, the lengths of cvfit$glmnet.fit$dev.ratio and cvfit$cvm == mse.min.cereal don't match. So I'm not quite sure how to pull the minimum-lambda dev.ratio from cvfit$glmnet.fit.
So I guess I'm wondering which process is best, why, and how people normally pull the appropriate dev.ratio statistic. Thanks!
cv.glmnet uses cross-validation to select lambda, whereas glmnet simply fits the model by minimizing the penalized cost function. (Both are documented in the glmnet package.) alpha is the elastic net mixing parameter α, with range α∈[0,1]: α=1 is lasso regression (the default) and α=0 is ridge regression. cv.glmnet returns an object of class "cv.glmnet", which is a list with the ingredients of the cross-validation fit: lambda, the values of lambda used in the fits; cvm, the mean cross-validated error, a vector of length length(lambda); and cvsd, an estimate of the standard error of cvm.
As pointed out in the comments, it has to do with the lambda sequence provided: if you look at the source code of cv.glmnet, it calls glmnet:::cv.glmnet.raw, which in its first few lines runs glmnet() on the defined lambda values.
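You can confirm this by peeking at the unexported function's body (a sketch; ::: reaches into glmnet internals, which may change between package versions):

library(glmnet)
# Print the first few expressions of cv.glmnet.raw; one of the early
# lines is the glmnet() call on the supplied lambda sequence.
as.list(body(glmnet:::cv.glmnet.raw))[1:6]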
So we can use an example below:
library(glmnet)
library(mlbench)
data(BostonHousing)
data = BostonHousing
data$chas = as.numeric(data$chas)  # convert the factor to numeric for the model matrix
cvfit = cv.glmnet(x=as.matrix(data[,-14]), y=data[,14])  # column 14 (medv) is the response
coef.cv <- coef(cvfit, s = "lambda.min")
fit = glmnet(x=as.matrix(data[,-14]), y=data[,14], alpha=1, lambda=cvfit$lambda.min)
coef <- coef(fit, s = "lambda.min")
head(cbind(coef.cv,coef))
6 x 2 sparse Matrix of class "dgCMatrix"
1 1
(Intercept) 31.74123706 31.86654225
crim -0.09834634 -0.09869320
zn 0.04144161 0.04158829
indus . .
chas 2.68518774 2.68163334
nox -16.30664523 -16.35459059
They are slightly different: when glmnet is given a single lambda it fits that value directly, without the warm starts used along a full path, so the coordinate-descent solution converges to a slightly different answer (the glmnet documentation advises against supplying a single value for lambda). If you instead provide the lambda sequence used in cv.glmnet:
fit = glmnet(x=as.matrix(data[,-14]), y=data[,14], alpha=1, lambda=cvfit$lambda)
coef <- coef(fit, s = cvfit$lambda.min)
head(cbind(coef.cv,coef))
6 x 2 sparse Matrix of class "dgCMatrix"
1 1
(Intercept) 31.74123706 31.74123706
crim -0.09834634 -0.09834634
zn 0.04144161 0.04144161
indus . .
chas 2.68518774 2.68518774
nox -16.30664523 -16.30664523
They are the same now, and the dev.ratio values match too:
fit$dev.ratio[fit$lambda==cvfit$lambda.min]
[1] 0.7401482
cvfit$glmnet.fit$dev.ratio[which.min(cvfit$cvm)]
[1] 0.7401482
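Here, indexing by which.min(cvfit$cvm) works because cvfit$glmnet.fit was fit on the same lambda sequence that was cross-validated, so the positions line up. Equivalently (under that same assumption), you can match on the lambda value itself:

# Pull dev.ratio by matching the lambda value rather than the position
cvfit$glmnet.fit$dev.ratio[cvfit$glmnet.fit$lambda == cvfit$lambda.min]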