I'm new to R.
I'm currently studying text mining with the "tm" package.
I have a problem mapping text to lower case:
library(tm)
sms_raw <- read.csv(............)
sms_corpus <- Corpus(VectorSource(sms_raw$text))
sms_corpus <- tm_map(sms_corpus, content_transformer(tolower))
error: invalid multibyte string 1
I thought my CSV file might not be UTF-8, so I re-saved it as UTF-8, but that didn't work.
My OS is Windows 8.1.
If anyone has a solution to this problem, please let me know.
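I solved the error easily with the encoding functions.
The column named "text" in my file contains multibyte characters.
So I typed:
sms_raw$text <- iconv(enc2utf8(sms_raw$text), sub = "byte")
This command converts the "text" column to UTF-8; with sub = "byte", any bytes that cannot be converted are replaced by their hex codes (for example <e9>) instead of causing an error.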
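For anyone who wants to see the effect without my data, here is a minimal sketch using only base R. The vector `text` is a made-up stand-in for `sms_raw$text`, and the explicit `from = "UTF-8", to = "UTF-8"` arguments are my assumption about the intended encoding. If your file is really in another encoding (say Latin-1), passing that as `from` would probably be the better choice.

```r
# Hypothetical sample data standing in for sms_raw$text.
# The second element contains the raw byte 0xE9, which is not valid UTF-8.
text <- c("Hello World", "Caf\xe9 OPEN")

# validUTF8() (base R >= 3.3.0) shows which elements are problematic.
is_valid <- validUTF8(text)
print(is_valid)

# Re-encode as UTF-8; sub = "byte" replaces each undecodable byte
# with its hex code in angle brackets instead of returning NA.
clean <- iconv(text, from = "UTF-8", to = "UTF-8", sub = "byte")

# The cleaned vector can now be lower-cased safely.
lowered <- tolower(clean)
print(lowered)
```

Running `validUTF8()` on your own column first is a quick way to find out which rows trigger the error before you decide how to convert them.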