I'm trying to use the glmnet package on a dataset. I'm using cv.glmnet() to get a lambda value for glmnet(). Here's the dataset and error message:
> head(t2)
  X1 X2        X3 X4 X5         X6    X7 X8 X9 X10 X11 X12
1  1  1 0.7661266 45  2 0.80298213  9120 13  0   6   0   2
2  2  0 0.9571510 40  0 0.12187620  2600  4  0   0   0   1
3  3  0 0.6581801 38  1 0.08511338  3042  2  1   0   0   0
4  4  0 0.2338098 30  0 0.03604968  3300  5  0   0   0   0
5  5  0 0.9072394 49  1 0.02492570 63588  7  0   1   0   0
6  6  0 0.2131787 74  0 0.37560697  3500  3  0   1   0   1
> str(t2)
'data.frame': 150000 obs. of 12 variables:
 $ X1 : int 1 2 3 4 5 6 7 8 9 10 ...
 $ X2 : int 1 0 0 0 0 0 0 0 0 0 ...
 $ X3 : num 0.766 0.957 0.658 0.234 0.907 ...
 $ X4 : int 45 40 38 30 49 74 57 39 27 57 ...
 $ X5 : int 2 0 1 0 1 0 0 0 0 0 ...
 $ X6 : num 0.803 0.1219 0.0851 0.036 0.0249 ...
 $ X7 : int 9120 2600 3042 3300 63588 3500 NA 3500 NA 23684 ...
 $ X8 : int 13 4 2 5 7 3 8 8 2 9 ...
 $ X9 : int 0 0 1 0 0 0 0 0 0 0 ...
 $ X10: int 6 0 0 0 1 1 3 0 0 4 ...
 $ X11: int 0 0 0 0 0 0 0 0 0 0 ...
 $ X12: int 2 1 0 0 0 1 0 0 NA 2 ...
> cv1 <- cv.glmnet(t2[,-c(1,2,7,12)], t2[,2], family="multinomial")
Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, :
  (list) object cannot be coerced to type 'double'
I'm excluding columns 1, 2, 7, and 12 because they are, respectively, the id column, the response column, and two columns that contain NAs. Any suggestions would be great.
The error "(list) object cannot be coerced to type 'double'" is a common one in R. It occurs when you attempt to convert a list of multiple elements to numeric, for example with as.numeric(), without first using the unlist() function. A data frame is itself a list of columns, so passing one where a numeric matrix is expected triggers the same error.
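To illustrate, here is a minimal base-R sketch that reproduces the coercion error on a toy list and then avoids it with unlist(). The list x is a made-up example, not the data from the question, and the exact wording of the message may differ slightly between R versions.

```r
# Hypothetical toy list: two elements, each a vector of length 3.
x <- list(1:3, 4:6)

# as.numeric() cannot flatten this list, so it signals an error.
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(
  {
    as.numeric(x)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Flattening with unlist() first gives a plain vector that converts cleanly.
flat <- as.numeric(unlist(x))
print(flat)
```

The same failure happens inside cv.glmnet when it tries to treat a data frame as a block of doubles.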
cv.glmnet expects a matrix of predictors, not a data frame. Generally you can obtain this via
X <- model.matrix(<formula>, data=<data>)
but in your case, you can probably get there more easily with
X <- as.matrix(t2[,-c(1,2,7,12)])
since you don't appear to have any factor variables or other issues that might complicate matters.
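As a rough sketch of both conversions, the following uses a small made-up data frame (t2_small, with only four columns) in place of the real t2. It only checks the shape and type of the result; the cv.glmnet call is left as a comment because it needs the glmnet package and the full data.

```r
# Hypothetical stand-in for t2: id, response, and two numeric predictors.
t2_small <- data.frame(
  X1 = 1:4,
  X2 = c(1, 0, 0, 1),
  X3 = c(0.77, 0.96, 0.66, 0.23),
  X4 = c(45L, 40L, 38L, 30L)
)

# A data frame subset is still a list, which is what triggers the error.
print(is.list(t2_small[, -c(1, 2)]))

# Option 1: direct conversion, fine when all predictors are numeric.
X <- as.matrix(t2_small[, -c(1, 2)])
y <- t2_small[, 2]

# Option 2: model.matrix(), which also expands factors if there are any.
# The first column is the intercept, which glmnet adds itself, so drop it.
X_mm <- model.matrix(X2 ~ X3 + X4, data = t2_small)[, -1]

print(is.matrix(X) && is.numeric(X))

# With glmnet installed, the fit would then look like:
# cv1 <- glmnet::cv.glmnet(X, y, family = "multinomial")
```

For purely numeric predictors the two options give the same values; model.matrix() is the safer choice once factor columns are involved.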
Since this answer is getting plenty of hits: the glmnetUtils package provides a formula-based interface to glmnet, like that used for most R modelling functions. It includes methods for glmnet and cv.glmnet, as well as a new cva.glmnet function to do crossvalidation for both alpha and lambda.
The above would become
cv.glmnet(X2 ~ ., data=t2[-1], family="multinomial")
NA's are handled automatically, so you don't have to exclude columns with missing values.