I currently use wordle for many artsy uses of the word cloud. I think that R's word cloud, potentially, has better control.
1) How do you keep a word capitalized in the word cloud? [SOLVED]
2) How do you keep two words together as one chunk in the word cloud? (wordle uses the ~ operator for this; R's wordcloud simply prints the ~ as is.) For instance, where there is a ~ between "to" and "be", I'd like a space in the word cloud.
require(wordcloud)
y<-c("the", "the", "the", "tree", "tree", "tree", "tree", "tree",
"tree", "tree", "tree", "tree", "tree", "Wants", "Wants", "Wants",
"Wants", "Wants", "Wants", "Wants", "Wants", "Wants", "Wants",
"Wants", "Wants", "to~be", "to~be", "to~be", "to~be", "to~be",
"to~be", "to~be", "to~be", "to~be", "to~be", "to~be", "to~be",
"to~be", "to~be", "to~be", "to~be", "to~be", "to~be", "to~be",
"to~be", "when", "when", "when", "when", "when", "familiar", "familiar",
"familiar", "familiar", "familiar", "familiar", "familiar", "familiar",
"familiar", "familiar", "familiar", "familiar", "familiar", "familiar",
"familiar", "familiar", "familiar", "familiar", "familiar", "familiar",
"leggings", "leggings", "leggings", "leggings", "leggings", "leggings",
"leggings", "leggings", "leggings", "leggings")
wordcloud(names(table(y)), table(y))
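You asked two questions:
1) You can control capitalization (or not) through the control parameters passed to TermDocumentMatrix.
2) There may be a parameter somewhere that handles ~, but here is an easy workaround: use gsub to change ~ to white space in the step just before plotting.
Some code: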
library(tm)  # provides Corpus, VectorSource and TermDocumentMatrix
corpus <- Corpus(VectorSource(y))
tdm <- TermDocumentMatrix(corpus, control=list(tolower=FALSE)) ## Edit 1
m <- as.matrix(tdm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
d$word <- gsub("~", " ", d$word) ## Edit 2
wordcloud(d$word, d$freq)
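The same gsub substitution can probably be applied without tm at all, by counting the words with table() as the question already does. The following is only a minimal sketch under that assumption: the vector y here is a shortened stand-in for the one in the question, the names freq, words and counts are made up for illustration, and the plotting call is guarded so the rest runs even if wordcloud is not installed.

```r
# Shortened stand-in for the question's vector y (first four distinct words only)
y <- c(rep("the", 3), rep("tree", 10), rep("Wants", 12), rep("to~be", 20))

# table() counts each distinct string and leaves capitalization untouched
freq <- table(y)

# Replace the literal "~" with a space just before plotting
words  <- gsub("~", " ", names(freq), fixed = TRUE)
counts <- as.vector(freq)

# Plot only if the wordcloud package is available
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words, counts, min.freq = 1)
}
```

Because table() does no lowercasing, "Wants" keeps its capital letter here without any extra option, and "to~be" is passed to the plot as the single term "to be".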