
sort() produces different results in Ubuntu and Windows

Tags:

windows

r

ubuntu

I have a vector that is being sorted differently when I run the code on my Windows vs. Ubuntu remote server.

Windows:

> u <- getNodes(network)
> head(u)
[1] "-1336623650" "-1749477680" "539"         "-1036241023" "6135"        "-44987577"
> uid <- sort(u)
> head(uid)
[1] "-1000019199" "-1000022360" "-1000039153" "-1000044219" "-1000069199" "-1000099640"

Ubuntu:

> u <- getNodes(network)
> head(u)
[1] "-1336623650" "-1749477680" "539"         "-1036241023" "6135"
[6] "-44987577"
> uid <- sort(u)
> head(uid)
[1] "10"          "100"         "1000"        "10000"       "-1000019199"
[6] "-1000022360"

Both implementations of R have the same packages loaded and are the same R version (3.3.1). Ubuntu is 13.10 and Windows is Windows 7.

Lauren Fitch asked Dec 08 '22 21:12


2 Answers

String sorting (which is what you are doing) in R is based on the locale, specifically its collation rules, and those differ between Windows and Linux systems. But do be careful: no locale will sort these strings in correct numerical order. If you want numerical order, you have to sort a vector of numbers.
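A minimal sketch of the numeric route, using the first six values from the question as a stand-in for the full vector (the names `u_num` and `u_sorted` are only illustrative):

```r
# Stand-in for getNodes(network): node IDs stored as character strings.
u <- c("-1336623650", "-1749477680", "539", "-1036241023", "6135", "-44987577")

# Option 1: convert to numbers and sort the numbers.
u_num <- sort(as.numeric(u))

# Option 2: keep the strings, but order them by their numeric value.
u_sorted <- u[order(as.numeric(u))]
print(u_sorted)
```

Either way the comparison is done on numbers, so the result no longer depends on the locale.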

Grab the value of Sys.getlocale("LC_COLLATE") on each system and compare the two. In my package, I run the code below at the entry point and report the collation setting in packageStartupMessage.

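# Run inside the entry-point function: on.exit() fires when that function returns.
collateOrigValue <- Sys.getlocale("LC_COLLATE")
# Restore the caller's collation setting on exit.
on.exit(Sys.setlocale("LC_COLLATE", collateOrigValue), add = TRUE)
Sys.setlocale("LC_COLLATE", "C")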
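The same pattern can be wrapped around a single sort call. This is only a sketch: `sort_c_locale` is a hypothetical helper name, not something from my package or from base R.

```r
# Hypothetical helper: sort a character vector under the "C" collation,
# then put the caller's LC_COLLATE setting back.
sort_c_locale <- function(x) {
  collateOrigValue <- Sys.getlocale("LC_COLLATE")
  # Registered first so the original setting is restored even if sort() fails.
  on.exit(Sys.setlocale("LC_COLLATE", collateOrigValue), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  sort(x)
}

before <- Sys.getlocale("LC_COLLATE")
res <- sort_c_locale(c("10", "-1000019199", "100"))
print(res)
```

In the "C" locale strings are compared by character code, and "-" comes before the digits, so the negative IDs sort first on both systems.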

See also https://stat.ethz.ch/R-manual/R-devel/library/base/html/locales.html

Tod Casasent answered Dec 10 '22 10:12


Use stringi::stri_sort or stringr::str_sort for consistent string sorting across operating systems.
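A short sketch of what that could look like, assuming the stringi package is installed. The sample values come from the question; passing an explicit locale ("en_US" here is just an example choice) means the result does not depend on each machine's default settings.

```r
library(stringi)

# Stand-in for getNodes(network): node IDs stored as character strings.
u <- c("-1336623650", "-1749477680", "539", "-1036241023", "6135", "-44987577")

# stri_sort() collates with ICU rather than the operating system's locale.
uid <- stri_sort(u, locale = "en_US")
print(uid)
```

This is still string order, not numerical order.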

Ista answered Dec 10 '22 10:12