
Error converting text to lowercase with tm_map(..., tolower)

I tried to convert my text to lowercase with tm_map(), but it gave the following error. How can I get around this?

 require(tm)
 byword<-tm_map(byword, tolower)

Error in UseMethod("tm_map", x) : 
  no applicable method for 'tm_map' applied to an object of class "character"
jackStinger asked Nov 30 '12 06:11



3 Answers

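The error says that byword is a plain character vector, not a corpus, so tm_map() has no method for it. Use the base R function tolower() on the vector directly: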

tolower(c("THE quick BROWN fox"))
# [1] "the quick brown fox"
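As a minimal sketch of the same idea applied to a whole vector: the byword values below are a hypothetical stand-in for the asker's data, which is not shown in the question. Only base R is used.

```r
# Hypothetical stand-in for the asker's data: a plain character vector.
byword <- c("THE quick BROWN fox", "Jumps OVER the lazy DOG")

# tm_map() only has methods for corpus classes, so check what you have.
class(byword)
# [1] "character"

# tolower() is vectorised: every element is lowercased in one call.
byword_lower <- tolower(byword)
byword_lower[2]
# [1] "jumps over the lazy dog"
```

If the text later needs to go into tm anyway, it can be lowercased this way first and turned into a corpus afterwards.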
bdemarest answered Nov 17 '22 21:11



Expanding my comment into a more detailed answer: you have to wrap tolower in content_transformer() so that tm_map() does not break the VCorpus object. Something like:

> library(tm)
> data('crude')
> crude[[1]]$content
[1] "Diamond Shamrock Corp said that\neffective today it had cut its contract prices for crude oil by\n1.50 dlrs a barrel.\n    The reduction brings its posted price for West Texas\nIntermediate to 16.00 dlrs a barrel, the copany said.\n    \"The price reduction today was made in the light of falling\noil product prices and a weak crude oil market,\" a company\nspokeswoman said.\n    Diamond is the latest in a line of U.S. oil companies that\nhave cut its contract, or posted, prices over the last two days\nciting weak oil markets.\n Reuter"
> tm_map(crude, content_transformer(tolower))[[1]]$content
[1] "diamond shamrock corp said that\neffective today it had cut its contract prices for crude oil by\n1.50 dlrs a barrel.\n    the reduction brings its posted price for west texas\nintermediate to 16.00 dlrs a barrel, the copany said.\n    \"the price reduction today was made in the light of falling\noil product prices and a weak crude oil market,\" a company\nspokeswoman said.\n    diamond is the latest in a line of u.s. oil companies that\nhave cut its contract, or posted, prices over the last two days\nciting weak oil markets.\n reuter"
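The same pattern might be applied to the asker's own data roughly as follows. This is a sketch under assumptions: the byword values are hypothetical, and VCorpus() is chosen here only to match the VCorpus object mentioned above.

```r
library(tm)

# Hypothetical input: a plain character vector like the asker's byword.
byword <- c("THE quick BROWN fox", "Jumps OVER the lazy DOG")

# Build a corpus first; tm_map() has no method for character vectors.
corpus <- VCorpus(VectorSource(byword))

# content_transformer() applies tolower() to each document's content and
# returns the document object, so the corpus structure is preserved.
corpus_lower <- tm_map(corpus, content_transformer(tolower))

# Inspect the text of the first document.
content(corpus_lower[[1]])
# [1] "the quick brown fox"
```

Wrapping the function matters because tolower() on its own returns a bare character vector rather than a document object.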
daroczig answered Nov 17 '22 20:11



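library(tm)

# Build a corpus from the character vector before calling tm_map().
myCorpus <- Corpus(VectorSource(byword))
# Wrap tolower in content_transformer() to keep the corpus structure intact.
myCorpus <- tm_map(myCorpus, content_transformer(tolower))

print(myCorpus[[1]])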
Khuyagbaatar Batsuren answered Nov 17 '22 20:11
