I get an error when I run this chunk of code:
text_lda <- LDA(text_dtm, k = 2, method = "VEM", control = NULL)
It fails with the following error: "Each row of the input matrix needs to contain at least one non-zero entry"
I then tried to find the empty rows with these lines:
rowTotals <- apply(text_dtm, 1, sum)
empty.rows <- text_dtm[rowTotals == 0, ]$dimnames[1][[1]]
But then I got the following error:
cannot allocate vector of size 3890.8 GB
This is the size of my DTM:
DocumentTermMatrix documents: 1968850, terms: 265238
Non-/sparse entries: 29766814/522184069486
Sparsity : 100%
Maximal term length: 4000
Weighting : term frequency (tf)
Try this:
empty.rows <- text_dtm[rowTotals == 0, ]$dimnames[1][[1]]
corpus_new <- corpus[-as.numeric(empty.rows)]
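Note that rowTotals still has to be computed somehow, and the apply() call from the question is what runs out of memory: apply() converts the DTM to a dense matrix, and 1968850 x 265238 cells at 8 bytes each comes to roughly 3890.8 GB, which matches the error. A minimal sketch of one way around that, assuming the slam package is installed (tm stores a DocumentTermMatrix in slam's simple_triplet_matrix format). The toy matrix below is made up for illustration and stands in for text_dtm:

```r
library(slam)  # assumption: slam is available (tm builds on it)

# Hypothetical stand-in for text_dtm: 4 documents, 3 terms,
# where documents 2 and 4 contain no terms at all.
toy_dtm <- simple_triplet_matrix(
  i = c(1L, 1L, 3L),   # row (document) index of each non-zero entry
  j = c(1L, 2L, 3L),   # column (term) index of each non-zero entry
  v = c(2, 1, 5),      # the counts themselves
  nrow = 4L, ncol = 3L,
  dimnames = list(Docs = as.character(1:4), Terms = c("a", "b", "c"))
)

# Row sums are computed on the sparse representation,
# so no dense documents x terms matrix is ever allocated.
rowTotals <- row_sums(toy_dtm)

# Document names of the empty rows, and the matrix without them.
empty.rows <- toy_dtm$dimnames[[1]][rowTotals == 0]
toy_dtm.new <- toy_dtm[rowTotals > 0, ]
```

With the real data, the same two calls would be applied to text_dtm instead of toy_dtm. The corpus[-as.numeric(empty.rows)] step above only works if the document names are the numeric positions of the documents in the corpus.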
Alternatively, if the DTM was generated with tm, keep only the rows that have at least one non-zero entry:
ui = unique(text_dtm$i)
text_dtm.new = text_dtm[ui,]
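This works because the i component of the DTM lists the row index of every non-zero entry, so a document that never appears in i is empty. A small sketch of the idea on a made-up toy matrix, again assuming the slam package; sort() is my addition so the documents keep their original order even if the entries are not stored row by row:

```r
library(slam)  # assumption: slam is available (tm builds on it)

# Hypothetical toy matrix: 4 documents, 3 terms, documents 2 and 4 are empty.
# The entries are deliberately not sorted by row.
toy_dtm <- simple_triplet_matrix(
  i = c(3L, 1L, 1L), j = c(3L, 1L, 2L), v = c(5, 2, 1),
  nrow = 4L, ncol = 3L,
  dimnames = list(Docs = as.character(1:4), Terms = c("a", "b", "c"))
)

# Rows that own at least one non-zero entry, in original document order.
ui <- sort(unique(toy_dtm$i))
toy_dtm.new <- toy_dtm[ui, ]

# Positions of the dropped documents, useful for removing the same
# documents from the corpus so that both stay aligned.
dropped <- setdiff(seq_len(toy_dtm$nrow), ui)
```

No row sums are needed here, which makes this the cheaper of the two approaches on a matrix of this size.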