
Using non-ASCII characters inside functions for packages

I'm trying to write a function equivalent to scales::dollar that adds a pound (£) symbol to the beginning of a figure. Since the scales code is so robust, I've used it as a framework and simply replaced the $ with the £.

A stripped-down function example:

pounds<-function(x) paste0("£",x)

When I run a CHECK I get the following:

Found the following file with non-ASCII characters:
  pounds.R
Portable packages must use only ASCII characters in their R code,
except perhaps in comments.
Use \uxxxx escapes for other characters.

The Writing R Extensions guide doesn't give a lot of help (IMO) on how to resolve this issue. It mentions the \uxxxx escapes and says they refer to Unicode characters.

Looking up Unicode characters gives me the code &#163; but the only guidance I can find for \uxxxx is minimal and relates to Java on W3Schools.

My question is thus:

How do you use non-ASCII characters in R functions via the \uxxxx escapes, and how does using the escapes affect the display of such characters once the function has been called?

asked Mar 06 '14 11:03 by Steph Locke


3 Answers

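For the \uxxxx escapes, you need to know the hexadecimal code point of your character. You can determine it using utf8ToInt (charToRaw returns the bytes of the string's encoding, which match the code point only in a Latin-1 session; in a UTF-8 session it gives C2 A3 for £):

sprintf("%04X", utf8ToInt("£"))
[1] "00A3"

Now you can use this to specify your non-ASCII character. In an R string, "\u00A3" and "£" represent the same character.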

Another option is to use stringi::stri_escape_unicode:

library(stringi)
stringi::stri_escape_unicode("➛")
# "\\u279b"

This informs you that "\u279b" represents the character "➛".
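
If you would rather not depend on stringi, the same lookup can be sketched in base R. The helper name to_escape is hypothetical (it is not part of R or any package), and it assumes a single string as input. It uses \u with four hex digits for code points up to FFFF and \U with eight digits above that.

```r
# Hypothetical helper (not part of base R or stringi): returns the escape
# sequence for every character in a single string, using base R only.
to_escape <- function(s) {
  cp <- utf8ToInt(enc2utf8(s))          # integer code points of the string
  ifelse(cp > 0xFFFF,
         sprintf("\\U%08X", cp),        # 8 hex digits beyond the BMP
         sprintf("\\u%04X", cp))        # 4 hex digits otherwise
}

# cat() shows the text as it should be typed into the source file,
# without the doubled backslash that print() would display.
cat(to_escape("\u00A3"), "\n")
```

Note that ASCII characters are escaped too (for example "a" becomes \u0061), so this is best applied to the individual character you need rather than to a whole line of code.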

answered Oct 14 '22 12:10 by Karsten W.


Try this:

pounds<-function(x) paste0("\u00A3",x)
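
On the display part of the question, here is a small sketch using only base R and the pounds function above. The escape is resolved when R parses the source, so the string holds the actual pound character, not the six characters of the escape. In a locale that can represent the character it should print as £; in one that cannot, print() may fall back to a <U+00A3> style form.

```r
# The source file stays pure ASCII; the parser turns \u00A3 into the
# single character with code point 163 (hex A3).
pounds <- function(x) paste0("\u00A3", x)

result <- pounds(c(5, 12.5))

nchar("\u00A3")       # 1: one character, not six
utf8ToInt("\u00A3")   # 163, i.e. hex A3
nchar(result)         # 2 and 5: the symbol counts as one character
print(result)         # display depends on the session's locale
```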
answered Oct 14 '22 14:10 by bartektartanus


The stringi package can be useful in these situations:

library(stringi)

stri_escape_unicode("£")
#> [1] "\\u00a3"
answered Oct 14 '22 12:10 by Jot eN
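
To find the offending lines before running a full CHECK, one option is tools::showNonASCIIfile. The sketch below writes a throwaway file with one literal and one escaped version of the function; the temporary file and the names bad and good are placeholders standing in for your own R/pounds.R.

```r
# Build a small source file: line 1 contains a literal pound sign,
# line 2 contains the ASCII-only escape (note the doubled backslash,
# which writes the text \u00A3 into the file).
src <- tempfile(fileext = ".R")
code <- c('bad  <- function(x) paste0("\u00A3", x)',
          'good <- function(x) paste0("\\u00A3", x)')
writeLines(enc2utf8(code), src, useBytes = TRUE)

# Prints the lines that contain non-ASCII bytes; only the first line
# should be reported here.
tools::showNonASCIIfile(src)
```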