Why is it not possible to pass only one explanatory variable to the glmnet function from the glmnet package, when the glm function from base R allows it?
The code and the error are below:
> modelX<-glm( ifelse(train$cliks <1,0,1)~(sparseYY[,40]), family="binomial")
> summary(modelX)
Call:
glm(formula = ifelse(train$cliks < 1, 0, 1) ~ (sparseYY[, 40]),
family = "binomial")
Deviance Residuals:
Min 1Q Median 3Q Max
-0.2076 -0.2076 -0.2076 -0.2076 2.8641
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.82627 0.00823 -464.896 <2e-16 ***
sparseYY[, 40] -0.25844 0.15962 -1.619 0.105
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 146326 on 709677 degrees of freedom
Residual deviance: 146323 on 709676 degrees of freedom
AIC: 146327
Number of Fisher Scoring iterations: 6
> modelY<-glmnet( y =ifelse(train$cliks <1,0,1), x =(sparseYY[,40]), family="binomial" )
Error in if (is.null(np) | (np[2] <= 1)) stop("x should be a matrix with 2 or more columns")
Glmnet is a package that fits generalized linear and similar models via penalized maximum likelihood. The regularization path is computed for the lasso or elastic net penalty at a grid of values (on the log scale) for the regularization parameter lambda.
cv.glmnet() performs cross-validation, 10-fold by default, which can be adjusted using nfolds. A 10-fold CV randomly divides your observations into 10 non-overlapping groups (folds) of approximately equal size. Each fold in turn serves as the validation set while the model is fit on the other 9 folds.
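A minimal sketch of changing the number of folds, assuming the glmnet package is installed. The data are simulated purely for illustration, and the names x, y and fold_id are hypothetical:

```r
library(glmnet)

set.seed(42)
n <- 300
x <- matrix(rnorm(n * 5), nrow = n, ncol = 5)        # 5 simulated predictors
y <- rbinom(n, 1, plogis(x[, 1] - 0.5 * x[, 2]))     # simulated binary response

# 5-fold instead of the default 10-fold cross-validation
cv_fit <- cv.glmnet(x, y, family = "binomial", nfolds = 5)

# Alternatively, assign observations to folds explicitly via foldid,
# which makes the split reproducible across calls
fold_id <- sample(rep(1:5, length.out = n))
cv_fit_fixed <- cv.glmnet(x, y, family = "binomial", foldid = fold_id)
```

Passing foldid is useful when several models should be compared on exactly the same folds.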
cv.glmnet reports two values of lambda: lambda.min, which minimizes the cross-validated error, and lambda.1se, the largest λ at which the cross-validated error (the MSE for Gaussian models) is within one standard error of that minimum. By default, coef() and predict() on a cv.glmnet fit use lambda.1se. This usually reduces overfitting by selecting a simpler model (fewer nonzero terms) whose error is still close to that of the model with the least error.
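The two choices of lambda can be compared directly. This is a sketch on simulated data, assuming the glmnet package is installed; x and y are hypothetical names:

```r
library(glmnet)

set.seed(7)
n <- 400
x <- matrix(rnorm(n * 8), nrow = n, ncol = 8)        # 8 simulated predictors
y <- rbinom(n, 1, plogis(x[, 1] + 0.5 * x[, 2]))     # only the first two matter

cv_fit <- cv.glmnet(x, y, family = "binomial")

cv_fit$lambda.min    # lambda with the smallest cross-validated error
cv_fit$lambda.1se    # largest lambda within one standard error of that minimum

coef_default <- coef(cv_fit)                         # uses s = "lambda.1se"
coef_min     <- coef(cv_fit, s = "lambda.min")       # less regularized fit
```

Because lambda.1se is never smaller than lambda.min, the default coefficients typically contain at least as many zeros as those at lambda.min.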
According to the glmnet documentation, dev.ratio is "the fraction of (null) deviance explained (for "elnet", this is the R-square)".
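As a rough illustration, dev.ratio can be read off a fitted object, one value per lambda on the path. The data below are simulated and the variable names are hypothetical; the sketch assumes the glmnet package is installed:

```r
library(glmnet)

set.seed(3)
n <- 200
x <- matrix(rnorm(n * 4), nrow = n, ncol = 4)    # 4 simulated predictors
y <- rbinom(n, 1, plogis(1.2 * x[, 1]))          # simulated binary response

fit <- glmnet(x, y, family = "binomial")

# One entry per lambda in the path; values grow as lambda shrinks
path <- data.frame(lambda = fit$lambda, dev_ratio = fit$dev.ratio)
head(path)

# print(fit) shows the same quantity as a percentage in the %Dev column
print(fit)
```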
Here is an answer I got to this question from the maintainer of the package (Trevor Hastie):
glmnet is designed to select variables from a (large) collection. Allowing for 1 variable would have created a lot of edge case programming, and I was not interested in doing that. Sorry!
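A commonly used workaround, which is not part of the maintainer's reply, is to pad the single predictor with a constant zero column so that x has two columns. The sketch below uses simulated data in place of train and sparseYY, and the names x1 and pad are hypothetical; it assumes the glmnet package is installed:

```r
library(glmnet)

set.seed(1)
n  <- 500
x1 <- rnorm(n)                                   # the single real predictor (simulated)
y  <- rbinom(n, 1, plogis(-1 + 0.8 * x1))        # simulated binary response

# glmnet() requires a matrix with at least 2 columns,
# so bind a constant zero column next to the predictor
x_padded <- cbind(x1 = x1, pad = 0)

fit <- glmnet(x = x_padded, y = y, family = "binomial")

# Coefficients at a small lambda; the padding column stays at zero
coefs <- as.matrix(coef(fit, s = 0.01))
coefs
```

The constant column carries no information, so its coefficient is zero along the whole path and only the real predictor is penalized. If no penalty is wanted at all, glm remains the simpler tool for a single variable.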