I'm using the tm package to apply stemming, and I need to convert the result into a data frame. A solution is given in "R tm package vcorpus: Error in converting corpus to data frame", but in my case the content of the corpus is printed as:
[[2195]]
i was very impress
instead of
[[2195]]
"i was very impress"
and because of this, if I apply
data.frame(text=unlist(sapply(mycorpus, `[`, "content")), stringsAsFactors=FALSE)
the result will be
<NA>.
Any help is much appreciated!
Code below as an example:
library(tm)  # provides Corpus, VectorSource, tm_map, stemDocument, inspect
sentence <- c("a small thread was loose on the sandals, otherwise it looked good")
mycorpus <- Corpus(VectorSource(sentence))
mycorpus <- tm_map(mycorpus, stemDocument, language = "english")
inspect(mycorpus)
[[1]]
a small thread was loo on the sandals, otherwi it look good
data.frame(text=unlist(sapply(mycorpus, `[`, "content")), stringsAsFactors=FALSE)
text
1 <NA>
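One plausible reason for the <NA>, assuming (as the unquoted output above suggests) that each document is stored as a bare character vector rather than as a list with a "content" element: indexing a character vector by a name it does not have returns NA instead of raising an error. The sketch below uses plain base R objects as hypothetical stand-ins for the two storage layouts; `bare_doc` and `list_doc` are illustrative names, not tm classes.

```r
# Hypothetical stand-ins for two ways a document might be stored.
# These are plain base R objects, not real tm classes.
bare_doc <- "i was very impress"                  # bare character vector
list_doc <- list(content = "i was very impress")  # list with a "content" element

# Indexing by a name that does not exist yields NA rather than an error.
bare_result <- unname(sapply(list(bare_doc), `[`, "content"))
print(is.na(bare_result))

# The same call finds the text when a "content" element is present.
list_result <- unlist(sapply(list(list_doc), `[`, "content"), use.names = FALSE)
print(list_result)
```

If this is what happens, the `[`, "content" extraction only suits corpora whose documents carry a named "content" element.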
Applying
gsub("http\\w+", "", mycorpus)
to the corpus returns an object of class character, which can be passed to data.frame() directly, so this works in my case.
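The workaround seems to rely on gsub() coercing any non-character input with as.character() before substituting. A minimal base R sketch of that effect, using a plain list as a hypothetical stand-in for the corpus (how as.character() treats a real tm corpus depends on the installed tm version, so that part is an assumption):

```r
# Plain list standing in for the stemmed corpus (not a real tm object).
docs <- list("a small thread was loo on the sandals, otherwi it look good")

# gsub() converts non-character input with as.character(), so the
# result is a character vector even when nothing matches the pattern.
txt <- gsub("http\\w+", "", docs)
print(class(txt))

# A character vector converts to a data frame without any "content" lookup.
df <- data.frame(text = txt, stringsAsFactors = FALSE)
print(df)
```

The pattern itself is incidental here: the coercion to character is what makes the data.frame() call succeed.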