
Adding NA and an expression that evaluates to NaN returns different results depending on operand order: a violation of the commutative property?

I am investigating corner cases of numeric operations in R. I came across the following particular case involving zero divided by zero:

(0/0)+NA
#> [1] NaN
NA+(0/0)
#> [1] NA

Created on 2021-07-10 by the reprex package (v2.0.0)

Session info
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.27     withr_2.4.2       magrittr_2.0.1    reprex_2.0.0     
#>  [5] evaluate_0.14     highr_0.9         stringi_1.6.2     rlang_0.4.11     
#>  [9] cli_3.0.0         rstudioapi_0.13   fs_1.5.0          rmarkdown_2.9    
#> [13] tools_4.1.0       stringr_1.4.0     glue_1.4.2        xfun_0.23        
#> [17] yaml_2.2.1        compiler_4.1.0    htmltools_0.5.1.1 knitr_1.33

This clearly violates the commutative property of addition. I have two questions:

  1. Is there an explanation of this behavior based on the R language definition?

  2. Are there other examples of the violation of the commutative property of addition (including in other languages) that don't involve side effects in the addend sub-expressions?

Paul Gourdin asked Jul 10 '21 21:07


2 Answers

Noting that

0/0
#[1] NaN

a more general example of the behavior of + in the question is the following:

NA + NaN
#[1] NA
 
NaN + NA
#[1] NaN

This was discussed in an R-devel thread, where R Core Team member Tomas Kalibera answered as follows (my emphasis and link).

Yes, the performance overhead of fixing this at R level would be too large and it would complicate the code significantly. The result of binary operations involving NA and NaN is hardware dependent (the propagation of NaN payload) - on some hardware, it actually works the way we would like - NA is returned - but on some hardware you get NaN or sometimes NA and sometimes NaN. Also there are C compiler optimizations re-ordering code, as mentioned in ?NaN. Then there are also external numerical libraries that do not distinguish NA from NaN (NA is an R concept). So I am afraid this is unfixable. The disclaimer mentioned by Duncan is in ?NaN/?NA, which I think is ok - there are so many numerical functions through which one might run into these problems that it would be infeasible to document them all. Some functions in fact will preserve NA, and we would not let NA turn into NaN unnecessarily, but the disclaimer says it is something not to depend on.
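The payload point in the quote can be made concrete with a small sketch. It assumes the usual IEEE 754 double layout and that R marks NA_real_ with the payload value 1954 in the low-order bits (my understanding of the R sources, not something stated in the quote). The variable names are only illustrative.

```r
# Both NA_real_ and NaN are IEEE 754 NaNs; R tells them apart by payload bits.
# writeBin() with a raw() connection returns the 8 bytes of the double.
na_bytes  <- writeBin(NA_real_, raw(), endian = "big")
nan_bytes <- writeBin(NaN, raw(), endian = "big")

# Assumption: the two lowest-order bytes of NA_real_ hold R's marker value.
na_payload <- as.integer(na_bytes[7]) * 256L + as.integer(na_bytes[8])

# Whichever operand's payload the hardware propagates, the result is still
# "missing" for is.na(); only is.nan() is sensitive to the operand order.
x <- NA_real_ + NaN
y <- NaN + NA_real_
both_missing <- is.na(x) && is.na(y)
```

Printing is.nan(x) and is.nan(y) on your own machine shows which payload was propagated there; as the quote says, that answer is not portable.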

Rui Barradas answered Sep 27 '22 22:09


According to ?NA, this can happen because 0/0 evaluates to NaN, and the result of combining NA with NaN is not guaranteed:

Numerical computations using NA will normally result in NA: a possible exception is where NaN is also involved, in which case either might result (which may depend on the R platform). However, this is not guaranteed and future CPUs and/or compilers may behave differently. Dynamic binary translation may also impact this behavior (with valgrind, computations using NA may result in NaN even when no NaN is involved).
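Since the documentation says the outcome is not something to depend on, one option is to avoid depending on it. The following is a minimal sketch, not an official recipe: normalize_missing is a hypothetical helper that maps every missing value (NA or NaN) in a double vector to NA_real_, so the result no longer depends on operand order.

```r
# Hypothetical helper: collapse NA and NaN into NA_real_.
# is.na() is TRUE for both NA and NaN, so both are replaced.
normalize_missing <- function(x) {
  stopifnot(is.double(x))
  x[is.na(x)] <- NA_real_
  x
}

a <- normalize_missing(NA_real_ + NaN)
b <- normalize_missing(NaN + NA_real_)
same_result <- identical(a, b)
```

The trade-off is that a genuine NaN (for example from 0/0 alone) is also reported as NA, so this only fits code that treats the two the same way.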

akrun answered Sep 27 '22 21:09
