I'm trying to run a LASSO on our dataset, and to do so, I need to convert non-numeric variables to numeric, ideally via a sparse matrix. However, when I try to use the Matrix command, I get the same error:
Error in asMethod(object) : invalid class 'NA' to dup_mMatrix_as_geMatrix
I thought this was due to NA's in my data, so I did an na.omit and got the same error. I tried again with a mini subset of my code and got the same error again:
> sparsecombined <- Matrix(combined1[1:10,],sparse=TRUE)
Error in asMethod(object) : invalid class 'NA' to dup_mMatrix_as_geMatrix
This is the data set I tried to convert with that last line of code:
Is there anything that jumps out that might prevent sparse conversion?
The easiest way to incorporate categorical variables into a LASSO is to use my glmnetUtils package, which provides a formula/data frame interface to glmnet.
glmnet(ArrDelay ~ ArrTime + uniqueCarrier + TailNum + Origin + Dest,
data=combined1, sparse=TRUE)
This automatically handles categorical vars via one-hot encoding (also known as dummy variables). It can also use sparse matrices if so desired.
I think the error is due to the fact that you have non-numeric data types in your matrix.
Perhaps first convert your nun-numeric columns like UniqueCarrier to binary vectors using one-hot encoding. And only then convert the matrix to sparse.
Here is my code that I used for that conversion:
# Convert Genre into binary variables
# Convert genreVector into a corpus in order to parse each text string into a binary vector with 1s representing the presence of a genre and 0s the absence
library(tm)
library(slam)
convertToBinary <- function(category) {
genreVector = category
genreVector = strsplit(genreVector, "(\\s)?,(\\s)?") # separate out commas
genreVector = gsub(" ", "_", genreVector) # combine DirectorNames with whitespaces
genreCorpus = Corpus(VectorSource(genreVector))
#dtm = DocumentTermMatrix(genreCorpus, list(dictionary=genreNames))
dtm = DocumentTermMatrix(genreCorpus)
binaryGenreVector = inspect(dtm)
return(binaryGenreVector)
#return(data.frame(binaryGenreVector)) # convert binaryGenreVector to dataframe
}
directorBinary = convertToBinary(x$Director)
directorBinaryDF = as.data.frame(directorBinary)
See nograpes answer in
recommenderlab, Error in asMethod(object) : invalid class 'NA' to dup_mMatrix_as_geMatrix
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With