
sprintf seems to ignore some special characters

Tags: r

Is this a bug?

> nchar(sprintf("%-20s", "Sao Paulo"))
[1] 20
> nchar(sprintf("%-20s", "São Paulo"))
[1] 19

> sessionInfo()
R version 3.2.4 (2016-03-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.4 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.2.4    fortunes_1.5-2

geotheory asked Apr 08 '16 13:04


2 Answers

Counting bytes instead of characters gives 20 in both cases:

> nchar(sprintf("%-20s", "Sao Paulo"), type = "bytes")
[1] 20
> nchar(sprintf("%-20s", "São Paulo"), type = "bytes")
[1] 20

TheRimalaya answered Sep 27 '22 04:09


The help page for sprintf points out that encodings matter, and the help page for nchar explains that there are different types of count: "bytes", "chars" and "width".

As a consequence, I see the following (on Linux, R 3.3.0 beta):

> nchars <- function(x) vapply(c("bytes","chars","width"),
                               function(typ) nchar(x, type=typ), 1)
> sp <- "São Paulo"
> Encoding(sp)
[1] "UTF-8"
> nchars(sp)
bytes chars width 
   10     9     9 
> nchars(sprintf("%-20s", sp))
bytes chars width 
   20    19    19 
> 

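So I'm claiming there is no bug at all: sprintf pads the string to 20 bytes, and since "ã" takes two bytes in UTF-8, the padded result has only 19 characters. I'm not saying much more than @TheRimalaya, but I am drawing a different conclusion.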
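
If the aim is a string that occupies 20 display columns regardless of multibyte characters, one option is to pad by display width instead of relying on the field width in sprintf. The following is only a minimal sketch: pad_right is a hypothetical helper (not part of base R), built from nchar(type = "width") and strrep, and strrep assumes R >= 3.3.0.

```r
# Hypothetical helper (not part of base R): left-justify x in a field of
# `width` display columns by appending spaces.
pad_right <- function(x, width) {
  # Display width of each string; this can differ from the byte count
  # when the string contains multibyte characters such as "ã"
  w <- nchar(x, type = "width")
  # Number of spaces still needed (never negative)
  n <- pmax(0L, width - w)
  paste0(x, strrep(" ", n))
}

# "São Paulo", written with an escape so the string is marked as UTF-8
sp <- "S\u00e3o Paulo"
padded <- pad_right(c("Sao Paulo", sp), 20)
nchar(padded, type = "width")
```

Strings that are already wider than the requested width are returned unchanged rather than truncated.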

Martin Mächler answered Sep 27 '22 04:09