When I call sort(x), where x is a character vector, the letter "y" jumps into the middle, right after the letter "i":
    > letters
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
    [21] "u" "v" "w" "x" "y" "z"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
    [21] "t" "u" "v" "w" "x" "z"
The reason may be that I am located in Lithuania and this is the Lithuanian sort order for letters, but I need the normal alphabetical order. How do I change the sorting method back to normal from within R code?
I'm using R 2.15.2 on Win7.
You need to change the locale that R is running in. Either do that for your entire Windows install (which seems suboptimal) or within the R session via:
    Sys.setlocale("LC_COLLATE", "C")
You can use any other valid locale string in place of "C" there, but "C" should get you back to the sort order for letters that you want.
Read ?locales for more.
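One thing to watch for after switching to "C": that locale compares strings by their byte values, so every uppercase letter sorts ahead of every lowercase one. Here is a minimal sketch of the call above that also restores the previous setting afterwards; the vector `mixed` and the variable names are made up for illustration.

```r
# Remember the current collation so it can be restored afterwards
old_collate <- Sys.getlocale("LC_COLLATE")

# Switch to the "C" locale, which compares strings byte by byte
Sys.setlocale("LC_COLLATE", "C")

# Lowercase letters now come back in plain a-z order
sorted_letters <- sort(letters)
print(identical(sorted_letters, letters))

# Caveat: in "C", uppercase letters sort before all lowercase letters
mixed <- c("b", "A", "a", "B")   # hypothetical example data
sorted_mixed <- sort(mixed)
print(sorted_mixed)

# Put the original collation back
Sys.setlocale("LC_COLLATE", old_collate)
```

If mixed-case data should sort case-insensitively, a different locale string may be a better fit than "C".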
I suppose it is worth noting the sister function Sys.getlocale(), which queries the current setting of a locale parameter. Hence you could do
    (locCol <- Sys.getlocale("LC_COLLATE"))
    Sys.setlocale("LC_COLLATE", "lt_LT")
    sort(letters)
    Sys.setlocale("LC_COLLATE", locCol)
    sort(letters)
    Sys.getlocale("LC_COLLATE")
    ## giving:
    > (locCol <- Sys.getlocale("LC_COLLATE"))
    [1] "en_GB.UTF-8"
    > Sys.setlocale("LC_COLLATE", "lt_LT")
    [1] "lt_LT"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n"
    [16] "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "z"
    > Sys.setlocale("LC_COLLATE", locCol)
    [1] "en_GB.UTF-8"
    > sort(letters)
     [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o"
    [16] "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
    > Sys.getlocale("LC_COLLATE")
    [1] "en_GB.UTF-8"
which of course is what @Hadley's Answer shows with_collate() doing somewhat more succinctly once you have devtools installed.
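The save, set and restore pattern above can be wrapped in a small function using only base R. This is a sketch: `sort_with_collate()` is a hypothetical helper name, not an existing function, and it assumes the requested locale string is valid on your system (otherwise `Sys.setlocale()` warns and the sort runs in the old locale).

```r
# sort_with_collate() is a hypothetical helper, not part of base R
sort_with_collate <- function(x, collate = "C") {
  old <- Sys.getlocale("LC_COLLATE")
  # Restore the caller's collation when the function exits, even on error
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", collate)
  sort(x)
}

before <- Sys.getlocale("LC_COLLATE")
result <- sort_with_collate(letters)
after  <- Sys.getlocale("LC_COLLATE")

print(result)
# The session's collation should be unchanged after the call
print(identical(before, after))
```

Using `on.exit()` means the session's collation is put back even if `sort()` throws an error partway through.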
If you want to do this temporarily, devtools provides the with_collate function:
    library(devtools)
    with_collate("C", sort(letters))
    #  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
    # [20] "t" "u" "v" "w" "x" "y" "z"
    with_collate("lt_LT", sort(letters))
    #  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "y" "j" "k" "l" "m" "n" "o" "p" "q" "r"
    # [20] "s" "t" "u" "v" "w" "x" "z"
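If you would rather not touch the locale at all, newer versions of R offer another route. As far as I know, `sort()` has accepted `method = "radix"` since R 3.3.0, and the documentation describes that method as not respecting the locale for character vectors, so this does not apply to the R 2.15.2 mentioned in the question. A small sketch with made-up example data:

```r
# Requires R >= 3.3.0 (an assumption about your installation).
# Radix sorting of character vectors uses C-locale ordering,
# regardless of the current LC_COLLATE setting.
x <- c("j", "y", "i", "a")   # hypothetical example data
sorted_radix <- sort(x, method = "radix")
print(sorted_radix)
```

The same caveat as for the "C" locale applies here: uppercase letters sort before lowercase ones.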