I am working with a regular data.frame that appears to be too big for the glm function, so I decided to build a sparse representation of the model matrix and pass that sparse matrix to glmnet. But sparse.model.matrix seems to drop some rows from the original data. Any idea why that happens, and how can I avoid it?
Code below:
> mm <- sparse.model.matrix(~clicks01+kl_tomek*bc1+hours+plec+1,
data = daneOst)
> dim(mm)
[1] 1253223 292
> dim(daneOst)
[1] 1258836 6
I've had some success with changing the na.action option to na.pass; this keeps all the rows in my matrix:
options(na.action='na.pass')
Just note that this is a global option, so you probably want to set it back to its original value afterwards, so it doesn't affect the rest of your code.
previous_na_action <- options('na.action')
options(na.action='na.pass')
# Do your stuff...
options(na.action=previous_na_action$na.action)
Solution from this answer.
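To see the effect on a small scale, here is a minimal sketch on a toy data frame. The data frame df and its columns x and g are hypothetical, not the asker's data. It builds the sparse model matrix once with the default na.omit behaviour and once with na.pass, then restores the previous option.

```r
library(Matrix)

# Hypothetical toy data: one missing value in the numeric predictor
df <- data.frame(x = c(1, 2, NA, 4),
                 g = factor(c("a", "b", "a", "b")))

# Remember the current setting and force the usual default (na.omit)
old <- options(na.action = "na.omit")
mm_default <- sparse.model.matrix(~ x + g, data = df)

# Switch to na.pass so that rows containing NA are kept
options(na.action = "na.pass")
mm_all <- sparse.model.matrix(~ x + g, data = df)

# Restore whatever was set before
options(old)

nrow(df)          # rows in the data
nrow(mm_default)  # rows left after NA rows were omitted
nrow(mm_all)      # rows when NAs are passed through
```

Keep in mind that with na.pass the NA values are still present in the resulting matrix, so they will probably need to be handled before fitting a model that does not accept missing values.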
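It's due to the NAs! By default, rows containing an NA in any variable used in the formula are dropped when the model matrix is built.
Run sum(complete.cases(daneOst)) on the original data frame. I bet it also gives you 1253223.
So replace the NAs in your data frame with an explicit value (e.g. 'IMPUTED_NA' for factors or -99999 for numeric columns), and then try again.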
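A minimal sketch of that replacement, again on a hypothetical toy data frame (df, x and g are made-up names). The sentinel -99999 and the level name 'IMPUTED_NA' are taken from the suggestion above. Whether a sentinel is appropriate for a numeric predictor depends on the model, so treat this as an illustration rather than a recommendation.

```r
# Hypothetical toy data with missing values in a numeric and a factor column
df <- data.frame(x = c(1, 2, NA, 4),
                 g = factor(c("a", NA, "a", "b")))

# How many rows would survive the default na.omit?
n_complete_before <- sum(complete.cases(df))

# Factor column: turn NA into an explicit level
g_chr <- as.character(df$g)
g_chr[is.na(g_chr)] <- "IMPUTED_NA"
df$g <- factor(g_chr)

# Numeric column: replace NA with a sentinel value (an assumption, see above)
df$x[is.na(df$x)] <- -99999

# No row is incomplete any more, so no row is dropped
n_complete_after <- sum(complete.cases(df))
mm <- model.matrix(~ x + g, data = df)
```

The same cleaned data frame can then be passed to sparse.model.matrix in place of model.matrix.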