Each row of the input matrix needs to contain at least one non-zero entry

Question

I get an error when I run this line of code:

text_lda <- LDA(text_dtm, k = 2, method = "VEM", control = NULL)

The error is "Each row of the input matrix needs to contain at least one non-zero entry".

I then tried to find the empty rows with these lines:

rowTotals <- apply(text_dtm, 1, sum)
empty.rows <- text_dtm[rowTotals == 0, ]$dimnames[1][[1]]

But then I got the following error:

cannot allocate vector of size 3890.8 GB

This is the size of my DTM:

DocumentTermMatrix documents: 1968850, terms: 265238
Non-/sparse entries: 29766814/522184069486
Sparsity           : 100%
Maximal term length: 4000
Weighting          : term frequency (tf)

captcoma · Accepted Answer

Try this (rowTotals holds the per-document term totals), then rebuild the DTM from corpus_new:

empty.rows <- text_dtm[rowTotals == 0, ]$dimnames[1][[1]]
corpus_new <- corpus[-as.numeric(empty.rows)]

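Or, if the DTM was generated with tm, keep only the documents that have at least one stored entry (text_dtm$i holds the row index of every stored entry):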

ui = unique(text_dtm$i)
text_dtm.new = text_dtm[ui,]
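
A note on the memory error: apply() coerces the DTM to a dense matrix first, and 1968850 x 265238 doubles at 8 bytes each is about 3890.8 GB, which is exactly the size in the error message. A minimal sketch of computing the row totals without densifying, assuming the DTM is a tm DocumentTermMatrix (a slam simple_triplet_matrix underneath). The toy_dtm object and the other variable names here are made up for illustration and stand in for text_dtm:

```r
library(slam)

# Hypothetical toy matrix standing in for text_dtm:
# 4 documents, 3 terms, and document 3 has no entries at all.
toy_dtm <- simple_triplet_matrix(
  i = c(1L, 1L, 2L, 4L),   # row (document) index of each stored entry
  j = c(1L, 3L, 2L, 1L),   # column (term) index of each stored entry
  v = c(2, 1, 5, 1),       # term counts
  nrow = 4L, ncol = 3L,
  dimnames = list(Docs = as.character(1:4), Terms = c("a", "b", "c"))
)

# row_sums() works on the sparse triplet form, so no dense matrix is built.
row_totals <- row_sums(toy_dtm)

# Positions of the empty documents and of the documents to keep.
empty_rows <- unname(which(row_totals == 0))
keep_rows  <- unname(which(row_totals > 0))

# Drop the empty documents before passing the matrix to LDA().
toy_dtm_new <- toy_dtm[keep_rows, ]
```

On the real data this would amount to something like rowTotals <- slam::row_sums(text_dtm) followed by text_dtm[rowTotals > 0, ], but I have not run it on a matrix of that size.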

Tags: memory, r, lda, topic-modeling, coding