Running the following code in R 4.1.1 gives different results between platforms.
set.seed(1)
x <- rnorm(3)[3]
print(x, 22)
# -0.83562861241004716 # intel windows
# -0.8356286124100471557341 # m1 mac
print(round(x, 15), 22)
# -0.83562861241004704 # intel windows
# -0.8356286124100470447118 # m1 mac
I know the size of the difference is below .Machine$double.eps and that the extra digits carry no meaningful information. Still, I am not happy that the diverging digits exist at all. How can I ensure exactly consistent results across platforms? Is there an RNG library that achieves this?
EDIT:
The bit representation is different.
set.seed(1)
x <- rnorm(100)
x <- sum(x)
SoDA::binaryRep(x)
.10101110001110000100001111110111000010011001011111011 # intel windows
.10101110001110000100001111110111000010011001011111110 # m1 mac
Bits are also different in runif(), which suggests that the uniform-to-Gaussian conversion is not the only breaking point.
set.seed(1)
x <- runif(10000000)
x <- sum(x)
SoDA::binaryRep(x)
# kind = "Mersenne-Twister"
.10011000100101000110100110111100101000100000101100000 # intel windows
.10011000100101000110100110111100101000011111001100000 # m1 mac
# kind = "Wichmann-Hill"
.10011000100111111110101000100001001001010100000011011 # intel windows
.10011000100111111110101000100001001001010100001001010 # m1 mac
# kind = "Marsaglia-Multicarry"
.10011000100011100110000010000001011100011110100001110 # intel windows
.10011000100011100110000010000001011100011110001010000 # m1 mac
# kind = "Super-Duper"
.10011000100010011010010110100001000101100011101011110 # intel windows
.10011000100010011010010110100001000101100100001111101 # m1 mac
# kind = "Knuth-TAOCP-2002"
.10011000101000110101010111000111010011101001000101100 # intel windows
.10011000101000110101010111000111010011101001000101101 # m1 mac
# kind = "Knuth-TAOCP"
.10011000100110001011010011000001011001001110011111000 # intel windows
.10011000100110001011010011000001011001001110011111001 # m1 mac
# kind = "L'Ecuyer-CMRG"
.10011000100100010110100101101001011000000111010110101 # intel windows
.10011000100100010110100101101001011000001000010100001 # m1 mac
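For reference, the same bit patterns can be inspected without SoDA; a minimal base-R sketch (the "%a" format is C99, so support depends on the platform's C library):
set.seed(1)
x <- sum(runif(10000000))
sprintf("%a", x)     # hexadecimal significand and exponent of the double
writeBin(x, raw())   # the raw IEEE-754 bytes, in native byte order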
Floating-point decimal values generally do not have an exact binary representation, because of how the CPU represents floating-point data. For this reason you may experience a loss of precision, and some floating-point operations may produce unexpected results.
The single-precision type float has 24 bits of precision, equivalent to only about 7 decimal digits (the rest of the 32 bits hold the sign and exponent), and the number of digits of precision is the same no matter the magnitude of the number. R's numeric type is a 64-bit IEEE-754 double with a 53-bit significand, about 15-16 significant decimal digits, so printed digits beyond that point carry no information.
In programming, a floating-point number is one where the position of the decimal point can "float" rather than being fixed within the number; examples are 1.23, 87.425, and 9039454.2.
Floating-point representations have a base β (which is always assumed to be even) and a precision p. If β = 10 and p = 3, then the number 0.1 is represented as 1.00 × 10^-1. If β = 2 and p = 24, then the decimal number 0.1 cannot be represented exactly; it is approximately 1.10011001100110011001101 × 2^-4.
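You can see this directly in R with base functions only (the exact printed digits may vary with the platform's C library):
print(0.1, digits = 22)   # 0.1000000000000000055511: the nearest double to 0.1
sprintf("%.20f", 0.1)     # "0.10000000000000000555"
0.1 + 0.2 == 0.3          # FALSE: the two sides round differently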
(Comments from Oct. 29 and Nov. 2 moved here and edited.)
I should note that such subtle reproducibility issues with pseudorandom number generators (PRNGs) can occur whenever floating-point arithmetic is involved. For instance, Intel's instruction set architecture can use 80-bit extended precision for internal arithmetic. Extended precision, though, is only one of many ways that floating-point arithmetic can lead to non-reproducible pseudorandom numbers; Intel's and Arm's instruction set architectures are different enough on their own to cause reproducibility issues. (If I understand correctly, Apple's M1 chip implements an Arm instruction set.)
By contrast, integer arithmetic has fewer reproducibility problems.
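A one-line illustration of the difference, using nothing platform-specific:
(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)   # FALSE: float addition is not associative,
                                         # so any change in evaluation order changes bits
(1L + 2L) + 3L == 1L + (2L + 3L)         # TRUE: integer addition is exact (absent overflow)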
Thus, if bit-for-bit reproducibility matters to you, try to find an R PRNG that uses only integer operations. (Indeed, computers generate pseudorandom floating-point numbers from integers, not the other way around, and most PRNGs produce integers, not floating-point numbers.)
For instance, for uniform variates, take the integer output of the Mersenne Twister without further manipulation. For Gaussian (and exponential) variates, there is fortunately an algorithm by Karney (2016) that generates arbitrary-precision variates using only integer operations. Rational arithmetic built on underlying integer operations is another option.
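To make the integer-only idea concrete, here is a minimal sketch of the classic Park-Miller "minimal standard" Lehmer generator, computed with Schrage's trick so that every intermediate value fits in a signed 32-bit integer; the function names are mine for illustration, and this is not one of R's built-in RNG kinds:
# state' = 16807 * state mod (2^31 - 1), pure 32-bit integer arithmetic,
# hence bit-for-bit reproducible across platforms
pm_next <- function(state) {
  a <- 16807L; m <- 2147483647L
  q <- 127773L    # m %/% a
  r <- 2836L      # m %%  a
  s <- a * (state %% q) - r * (state %/% q)   # never overflows: |s| < m
  if (s <= 0L) s + m else s
}
pm_stream <- function(n, seed = 1L) {   # seed must be in 1..(m - 1)
  out <- integer(n)
  s <- seed
  for (i in seq_len(n)) { s <- pm_next(s); out[i] <- s }
  out
}
ints <- pm_stream(5)     # identical integers on every platform
u <- ints / 2147483647   # a single IEEE-754 division is correctly rounded,
                         # so even this conversion to (0,1) should reproduce exactly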
REFERENCES:
Karney, C.F.F. (2016). Sampling exactly from the normal distribution. ACM Transactions on Mathematical Software (TOMS), 42(1), pp. 1-14.