I noticed that I can use some Greek letters as names, while others are either illegal or just aliases for letters from the Latin alphabet.
Basically I can use β or µ (though β is changed to ß when printing, and ß and β act as aliases):
list(β = 1)
# $ß
# [1] 1
list(μ = 1)
# $µ
# [1] 1
α, Γ, δ, ε, Θ, π, Σ, σ, τ, Φ, φ and Ω are allowed but act as aliases for Latin letters:
list(α = 1)
# $a
# [1] 1
αa <- 42
aa
# [1] 42
GG <- 33
ΓΓ
# [1] 33
Other letters I've tested just don't "work":
ι <- 1
# Error: unexpected input in "\"
Λ <- 1
# Error: unexpected input in "\"
λ <- 1
# Error: unexpected input in "\"
I was surprised about λ, as it's defined by the package wrapr's define_lambda, so I assume this depends on the system.
I know similar or identical looking characters can have different encodings, and some of them don't go well with copy/paste between apps. The code in this question returns the described output when pasted back into RStudio.
?make.names says:
A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number
So part of the question is: what's a letter, and what's going on here?
More specifically:
- Are µ and β (or ß) safe to use in a package?
- Why isn't λ (intToUtf8(955)) usable on my system, while it seems to be commonly used by wrapr's users?
- Are there other characters I could use instead? (ø looks cool and seems to work on my system)
This was all prompted by the fact that I'm looking for a one (or two) character function name that wouldn't conflict with an existing or commonly used name, and would look a bit funky. . is already used a lot and I use .. already as well.
From sessionInfo():
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
I'm not an expert by any means, but let's try to analyze the problem. In the end, your R code needs to be understood by the R interpreter, so the source code of make.names() may be helpful:
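make.names <- function(names, unique = FALSE, allow_ = TRUE) {
    names <- as.character(names)
    # the actual validation happens in C, via .Internal()
    names2 <- .Internal(make.names(names, allow_))
    if (unique) {
        # names that were left unchanged sort first, so they keep their spelling
        o <- order(names != names2)
        names2[o] <- make.unique(names2[o])
    }
    names2
}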
Now, .Internal() hands the call over to R's internals (written in C), so we need to go a little deeper. The C code responsible for handling the make.names() request can be found here: https://github.com/wch/r-source/blob/0dccb93e114b00b2fcbe75e8721f11a8f2ffdff4/src/main/character.c
The relevant function:
SEXP attribute_hidden do_makenames(SEXP call, SEXP op, SEXP args, SEXP env)
{
    SEXP arg, ans;
    R_xlen_t i, n;
    int l, allow_;
    char *p, *tmp = NULL, *cbuf;
    const char *This;
    Rboolean need_prefix;
    const void *vmax;

    checkArity(op ,args);
    arg = CAR(args);
    if (!isString(arg))
        error(_("non-character names"));
    n = XLENGTH(arg);
    allow_ = asLogical(CADR(args));
    if (allow_ == NA_LOGICAL)
        error(_("invalid '%s' value"), "allow_");
    PROTECT(ans = allocVector(STRSXP, n));
    vmax = vmaxget();
    for (i = 0 ; i < n ; i++) {
        This = translateChar(STRING_ELT(arg, i));
        l = (int) strlen(This);
        /* need to prefix names not beginning with alpha or ., as
           well as . followed by a number */
        need_prefix = FALSE;
        if (mbcslocale && This[0]) {
            int nc = l, used;
            wchar_t wc;
            mbstate_t mb_st;
            const char *pp = This;
            mbs_init(&mb_st);
            used = (int) Mbrtowc(&wc, pp, MB_CUR_MAX, &mb_st);
            pp += used; nc -= used;
            if (wc == L'.') {
                if (nc > 0) {
                    Mbrtowc(&wc, pp, MB_CUR_MAX, &mb_st);
                    if (iswdigit(wc)) need_prefix = TRUE;
                }
            } else if (!iswalpha(wc)) need_prefix = TRUE;
        } else {
            if (This[0] == '.') {
                if (l >= 1 && isdigit(0xff & (int) This[1])) need_prefix = TRUE;
            } else if (!isalpha(0xff & (int) This[0])) need_prefix = TRUE;
        }
        if (need_prefix) {
            tmp = Calloc(l+2, char);
            strcpy(tmp, "X");
            strcat(tmp, translateChar(STRING_ELT(arg, i)));
        } else {
            tmp = Calloc(l+1, char);
            strcpy(tmp, translateChar(STRING_ELT(arg, i)));
        }
        if (mbcslocale) {
            /* This cannot lengthen the string, so safe to overwrite it. */
            int nc = (int) mbstowcs(NULL, tmp, 0);
            if (nc >= 0) {
                wchar_t *wstr = Calloc(nc+1, wchar_t);
                mbstowcs(wstr, tmp, nc+1);
                for (wchar_t * wc = wstr; *wc; wc++) {
                    if (*wc == L'.' || (allow_ && *wc == L'_'))
                        /* leave alone */;
                    else if (!iswalnum((int)*wc)) *wc = L'.';
                }
                wcstombs(tmp, wstr, strlen(tmp)+1);
                Free(wstr);
            } else error(_("invalid multibyte string %d"), i+1);
        } else {
            for (p = tmp; *p; p++) {
                if (*p == '.' || (allow_ && *p == '_')) /* leave alone */;
                else if (!isalnum(0xff & (int)*p)) *p = '.';
                /* else leave alone */
            }
        }
        // l = (int) strlen(tmp); /* needed? */
        SET_STRING_ELT(ans, i, mkChar(tmp));
        /* do we have a reserved word? If so the name is invalid */
        if (!isValidName(tmp)) {
            /* FIXME: could use R_Realloc instead */
            cbuf = CallocCharBuf(strlen(tmp) + 1);
            strcpy(cbuf, tmp);
            strcat(cbuf, ".");
            SET_STRING_ELT(ans, i, mkChar(cbuf));
            Free(cbuf);
        }
        Free(tmp);
        vmaxset(vmax);
    }
    UNPROTECT(1);
    return ans;
}
As we can see, implementation-dependent datatypes such as wchar_t (http://icu-project.org/docs/papers/unicode_wchar_t.html) are used, together with classification functions such as iswalpha() whose answer depends on the C library and the active locale. This means that the behavior of make.names() depends on how the R interpreter was built and on the environment it runs in. The C standard leaves these details implementation-defined, so no assumption about which non-ASCII characters count as letters can be made. Everything including operating system, hardware, locale etc. can change this behavior.
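To make that dependence a bit more concrete, here is a minimal sketch (not taken from the R sources) that asks the C library whether a given wide character is a letter under a named locale. The helper name is_letter_in_locale is hypothetical, and locale names are platform-specific, so any name other than "C" is an assumption about your system.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not part of R.
   Returns 1 if `wc` is a letter according to iswalpha() under the named
   LC_CTYPE locale, 0 if it is not, and -1 if the locale cannot be set. */
int is_letter_in_locale(const char *locale_name, wchar_t wc)
{
    /* Copy the current LC_CTYPE name: the pointer returned by setlocale()
       may be invalidated by the next setlocale() call. */
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    snprintf(saved, sizeof saved, "%s", current ? current : "C");

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1; /* locale not available; LC_CTYPE is left unchanged */

    int result = iswalpha((wint_t) wc) ? 1 : 0;

    /* Restore whatever locale was active before the call. */
    setlocale(LC_CTYPE, saved);
    return result;
}
```

Calling it with the code point of λ (0x03BB, i.e. intToUtf8(955)) under different locale names available on your machine is one way to see whether the C library itself treats that character as a letter; I would not expect the answer to be the same everywhere.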
In conclusion, I would stick to ASCII characters if you want to be safe, especially when sharing your code between different operating systems.
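If you want to check that mechanically, something along these lines could work. It is only a sketch of the rule quoted from ?make.names, restricted to ASCII; is_portable_name and its helpers are made-up names, and reserved words such as if or TRUE are deliberately not checked.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of R. Explicit ranges are used instead of
   isalpha()/isdigit() so the result never depends on the locale. */
static int is_ascii_letter(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Returns 1 if `name` uses only ASCII letters, digits, '.' and '_', and
   starts with a letter or with a dot that is not followed by a digit.
   Returns 0 otherwise. Reserved words are NOT checked. */
int is_portable_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return 0;

    if (name[0] == '.') {
        if (is_ascii_digit(name[1]))
            return 0;
    } else if (!is_ascii_letter(name[0])) {
        return 0;
    }

    for (const char *p = name + 1; *p; p++) {
        if (!is_ascii_letter(*p) && !is_ascii_digit(*p) &&
            *p != '.' && *p != '_')
            return 0;
    }
    return 1;
}
```

Any byte outside the ASCII range, which includes every byte of a UTF-8 encoded Greek letter, makes the function return 0.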