I am investigating corner cases of numeric operations in R. I came across the following particular case involving zero divided by zero:
(0/0)+NA
#> [1] NaN
NA+(0/0)
#> [1] NA
Created on 2021-07-10 by the reprex package (v2.0.0)
Session info
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.27 withr_2.4.2 magrittr_2.0.1 reprex_2.0.0
#> [5] evaluate_0.14 highr_0.9 stringi_1.6.2 rlang_0.4.11
#> [9] cli_3.0.0 rstudioapi_0.13 fs_1.5.0 rmarkdown_2.9
#> [13] tools_4.1.0 stringr_1.4.0 glue_1.4.2 xfun_0.23
#> [17] yaml_2.2.1 compiler_4.1.0 htmltools_0.5.1.1 knitr_1.33
This clearly violates the commutative property of addition. I have two questions:
Is there an explanation of this behavior based on the R language definition?
Are there other examples of the violation of the commutative property of addition (including in other languages) that don't involve side effects in the addend sub-expressions?
Noting that
0/0
#[1] NaN
a more general example of the behavior of + in the question is the following:
NA + NaN
#[1] NA
NaN + NA
#[1] NaN
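To check which of the two values an expression actually produced, is.na() alone is not enough, because it is TRUE for both NA and NaN, while is.nan() is TRUE only for NaN. Here is a small sketch of how the two can be told apart; the helper name classify_missing is made up for illustration and is not part of base R:

```r
# Hypothetical helper: label each element as "NaN", "NA", or "value".
classify_missing <- function(x) {
  ifelse(is.nan(x), "NaN", ifelse(is.na(x), "NA", "value"))
}

# Elements are labelled, in order: value, NA, NaN, NaN
classify_missing(c(1, NA, NaN, 0/0))

# The labels for these two depend on the platform, so no output is shown:
classify_missing(NA + NaN)
classify_missing(NaN + NA)
```

Running the last two lines on your own machine shows whether your platform reproduces the asymmetry from the question.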
This is discussed in an R-devel thread, where R Core Team member Tomas Kalibera gives the following answer (emphasis and link mine):
Yes, the performance overhead of fixing this at R level would be too large and it would complicate the code significantly. The result of binary operations involving NA and NaN is hardware dependent (the propagation of NaN payload) - on some hardware, it actually works the way we would like - NA is returned - but on some hardware you get NaN or sometimes NA and sometimes NaN. Also there are C compiler optimizations re-ordering code, as mentioned in ?NaN. Then there are also external numerical libraries that do not distinguish NA from NaN (NA is an R concept). So I am afraid this is unfixable. The disclaimer mentioned by Duncan is in ?NaN/?NA, which I think is ok - there are so many numerical functions through which one might run into these problems that it would be infeasible to document them all. Some functions in fact will preserve NA, and we would not let NA turn into NaN unnecessarily, but the disclaimer says it is something not to depend on.
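The "NaN payload" in the quote refers to the fact that, with the usual IEEE 754 doubles, R's NA_real_ is itself a NaN bit pattern whose low-order bits carry a marker value (1954 in the R sources). Which operand's payload survives an arithmetic operation is left to the hardware. A rough way to look at the bytes, assuming 8-byte IEEE 754 doubles (the helper name double_bytes is invented here):

```r
# Hypothetical helper: the 8 bytes of a double, most significant byte first.
double_bytes <- function(x) writeBin(x, raw(), endian = "big")

double_bytes(NaN)        # an ordinary NaN
double_bytes(NA_real_)   # also a NaN bit pattern; the last two bytes are 07 a2 (= 1954)

# Which payload survives depends on the platform, so no output is shown:
double_bytes(NA_real_ + NaN)
double_bytes(NaN + NA_real_)
```

If the last two lines print different bytes, the payload of only one operand was propagated, which is the asymmetry seen in the question.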
According to ?NA, this could be because the NaN that results from 0/0 is involved:
Numerical computations using NA will normally result in NA: a possible exception is where NaN is also involved, in which case either might result (which may depend on the R platform). However, this is not guaranteed and future CPUs and/or compilers may behave differently. Dynamic binary translation may also impact this behavior (with valgrind, computations using NA may result in NaN even when no NaN is involved).
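Since the documentation says not to depend on either result, code that needs NA to take precedence has to enforce that itself. This is a minimal sketch, assuming numeric inputs; both function names are hypothetical and nothing like this is built into R:

```r
# Hypothetical helper: TRUE only for a "real" NA, not for NaN.
is_true_na <- function(x) is.na(x) & !is.nan(x)

# Hypothetical wrapper around `+` that forces NA whenever either operand is NA,
# regardless of what the platform returned for x + y.
add_na_wins <- function(x, y) {
  result <- x + y
  result[is_true_na(x) | is_true_na(y)] <- NA
  result
}

# Both orders now give NA rather than NaN:
add_na_wins(NA_real_, NaN)
add_na_wins(NaN, NA_real_)
```

The extra checks cost time on every call, which is in line with the performance concern raised in the R-devel answer. In most code it is simpler to test with is.na(), which treats NA and NaN alike.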