I have written the script below to convert Unicode numeric character references into Chinese characters. The last string in temp.df[,"name_unicode"]
is "&#167;&#174;&#163;" (§®£, without the quotes), included so that people who do not know Chinese can also help.
library(RODBC)
library(Unicode)
temp.df <- data.frame(name_unicode=c("&#38515;&#22823;&#25991;",
                                     "&#38515;&#23567;&#25935;",
                                     "&#38515;&#19968;&#23665;",
                                     "&#167;&#174;&#163;"),
                      stringsAsFactors=FALSE)
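temp.df[,"name_unicode_mod"] <- sapply(temp.df[,"name_unicode"],
  function(x) {
    # Split the string into one piece per character reference
    temp <- unlist(strsplit(x,";"))
    # Keep only the decimal code point and convert it to hex
    temp <- sprintf("%x",as.integer(gsub("[^0-9]","",temp)))
    # Turn the code points into a single UTF-8 string
    temp <- intToUtf8(as.u_char_range(temp))
    return(temp)
  })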
write.csv(temp.df,file("test.csv",encoding="UTF-8"),row.names=FALSE)
The output in temp.df[,"name_unicode_mod"] displays correctly in the R console, but I need to export it in csv or xls format. I tried write.csv, write.table, and odbcConnectExcel from RODBC, but all of them give me something like <U+00A7><U+00AE><U+00A3>.
Can anyone help? Thanks.
P.S. I am using R 3.0.0 and Win7
Writing the file in binary mode will work in your case. Here is a small sample:
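writeUtf8csv <- function(x, file) {
  # Binary mode ("wb") stops R from re-encoding the text on Windows
  con <- file(file, "wb")
  on.exit(close(con))
  # The return value of apply() is not useful here, so hide it
  invisible(apply(x, 1, function(a) {
    # One comma-separated row, ended with CRLF (no header, no quoting)
    b <- paste(paste(a, collapse=','), '\r\n', sep='')
    # Make sure the bytes written are UTF-8, whatever the native encoding
    writeBin(charToRaw(enc2utf8(b)), con)
  }))
}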
More details are shown in this reference page.
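If the csv is meant to be opened in Excel, a variation on the same binary-writing idea may help: Excel relies on a leading UTF-8 byte order mark to recognise the encoding. The sketch below is only an illustration under that assumption. The helper name write_utf8_csv_bom and the demo data frame are made up for this example, and like the function above it does no quoting of fields that contain commas.

```r
# Hypothetical helper: binary writer that also emits a BOM and a header row
write_utf8_csv_bom <- function(x, path) {
  con <- file(path, "wb")            # binary mode: R does not re-encode
  on.exit(close(con))
  # UTF-8 byte order mark, which Excel uses to detect the encoding
  writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
  # Header first, then one line per row; as.matrix() turns cells into text
  rows <- c(paste(names(x), collapse = ","),
            apply(as.matrix(x), 1, paste, collapse = ","))
  for (line in rows) {
    # Convert each line to UTF-8 and write its raw bytes with a CRLF ending
    writeBin(charToRaw(enc2utf8(paste0(line, "\r\n"))), con)
  }
  invisible(path)
}

# "\u" escapes keep the example independent of the script file's encoding
demo <- data.frame(name = c("\u9673\u5927\u6587", "\u00a7\u00ae\u00a3"),
                   stringsAsFactors = FALSE)
out <- tempfile(fileext = ".csv")
write_utf8_csv_bom(demo, out)

# The first three bytes of the file should be the BOM: ef bb bf
print(readBin(out, "raw", 3))

# Reading back with "UTF-8-BOM" drops the mark again
back <- read.csv(out, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)
```

Whether the characters in back display correctly still depends on the locale of the R session, which is the same limitation that caused the <U+00A7> output in the first place. The bytes in the file itself are UTF-8 either way.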