How to source() .R file saved using UTF-8 encoding?

The following works fine when copied and pasted directly into R:

> character_test <- function() print("R同时也被称为GNU S是一个强烈的功能性语言和环境，探索统计数据集，使许多从自定义数据图形显示...")
> character_test()
[1] "R同时也被称为GNU S是一个强烈的功能性语言和环境,探索统计数据集,使许多从自定义数据图形显示..."

However, if I make a file called character_test.R containing the EXACT SAME code, save it in UTF-8 encoding (so as to retain the special Chinese characters), then when I source() it in R, I get the following error:

> source(file="C:\\Users\\Tony\\Desktop\\character_test.R", encoding = "UTF-8")
Error in source(file = "C:\\Users\\Tony\\Desktop\\character_test.R", encoding = "utf-8") :
  C:\Users\Tony\Desktop\character_test.R:3:0: unexpected end of input
1: character.test <- function() print("R
2:
  ^
In addition: Warning message:
In source(file = "C:\\Users\\Tony\\Desktop\\character_test.R", encoding = "UTF-8") :
  invalid input found on input connection 'C:\Users\Tony\Desktop\character_test.R'

Any help in solving this, and in understanding what is going on here, would be much appreciated.

> sessionInfo() # Windows 7 Pro x64
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

loaded via a namespace (and not attached):
[1] tools_2.12.1

and

> l10n_info()
$MBCS
[1] FALSE

$`UTF-8`
[1] FALSE

$`Latin-1`
[1] TRUE

$codepage
[1] 1252

asked Feb 17 '11 16:02

Tony Breyal

1 Answer

On R/Windows, source runs into problems with any UTF-8 characters that can't be represented in the current locale (or ANSI Code Page in Windows-speak). And unfortunately Windows doesn't have UTF-8 available as an ANSI code page--Windows has a technical limitation that ANSI code pages can only be one- or two-byte-per-character encodings, not variable-byte encodings like UTF-8.
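As a rough illustration of "can't be represented in the current locale", the sketch below asks iconv() whether two sample strings can be converted to code page 1252, the one shown in the question's l10n_info() output. The sample strings and variable names are made up for the example, and the exact behaviour of iconv() can vary between platforms, so treat this as a sketch rather than a diagnostic.

```r
# Illustration only: can each string be represented in Windows code page 1252?
# With the default sub = NA, iconv() returns NA for input it cannot convert.
samples <- c(latin = "caf\u00e9", chinese = "R\u540c\u65f6")

converted <- iconv(samples, from = "UTF-8", to = "CP1252")

# TRUE where a CP1252 representation exists, FALSE where it does not.
representable <- !is.na(converted)
print(representable)
```

The accented Latin string has a CP1252 form, while the Chinese characters do not, which is the situation source() trips over here.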

This doesn't seem to be a fundamental, unsolvable problem--there's just something wrong with the source function. You can get 90% of the way there by doing this instead:

eval(parse(filename, encoding="UTF-8"))

This'll work almost exactly like source() with default arguments, but won't let you do echo=T, print.eval=T, etc.
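If you use this a lot, the one-liner can be wrapped in a small function. The sketch below is only that: source_utf8 is a hypothetical name (not part of base R), and the temporary file and demo_env exist purely so the example is self-contained. The script text uses \u escapes so the example itself stays ASCII.

```r
# Hypothetical helper, not part of base R: parse a UTF-8 script and
# evaluate its expressions, i.e. eval(parse(filename, encoding = "UTF-8")).
source_utf8 <- function(path, envir = globalenv()) {
  exprs <- parse(path, encoding = "UTF-8")
  invisible(eval(exprs, envir = envir))
}

# Demo: write a one-line UTF-8 script to a temporary file.
script <- tempfile(fileext = ".R")
code <- 'character_test <- function() "R\u540c\u65f6\u4e5f\u88ab\u79f0\u4e3aGNU S"'
writeLines(enc2utf8(code), script, useBytes = TRUE)

# Load it into a separate environment so the demo leaves the workspace alone.
demo_env <- new.env()
source_utf8(script, envir = demo_env)
result <- demo_env$character_test()
```

By default the helper evaluates into the global environment, like source() does; pass envir to keep the definitions somewhere else.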

answered Sep 28 '22 09:09

Joe Cheng


Tags: file-io, r, encoding, utf-8, internationalization
