
Why does RUnit change my random numbers?

Tags: random, r, runit

In a unit test I start a helper function (generating test data) with:

set.seed(1)

I was developing the unit test interactively like this:

source('tests/runit.functions.R');test.something()

But then when I went to run the tests from my run_tests.R they failed. I narrowed it down to different random numbers, despite the set.seed(1) command! I added this line, just after set.seed(1):

print(sessionInfo());print("RANDOM SEED:");print(.Random.seed)

The really interesting part is the random seed is entirely different. In the batch script it is just three numbers:

501 1280795612 -169270483

Whereas in my interactive R session it is a 626-element monster:

[1]         403         624  -169270483  -442010614 ...
 ...
[617]   197184543    -2095148  ... -689249108

The first number (501 vs. 403) is apparently the type of random number generator, but I couldn't track down the master list of what the numbers mean.

I think the core of my question is: what is the best way to make sure my unit tests have reliable random number generation? A secondary question is troubleshooting advice: how do I track down which random number generator is being used and, more importantly, which code, package or setting decided to use it?

The sessionInfo is not looking very helpful, but it is showing some small differences. E.g. the inclusion of the TTR package is due to other unit tests being run. Here is sessionInfo output from the batch script, where the first line is #!/usr/bin/Rscript --slave:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C              LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8     LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8   
 [7] LC_PAPER=C                LC_NAME=C                 LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets  base     

other attached packages:
[1] TTR_0.21-1   xts_0.8-6    zoo_1.7-7    RUnit_0.4.26

loaded via a namespace (and not attached):
[1] grid_2.15.1    lattice_0.20-6

And here is the output from my interactive R session, which is started from the command line with R --no-save:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C              LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8     LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8   
 [7] LC_PAPER=C                LC_NAME=C                 LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] xts_0.8-6    zoo_1.7-7    RUnit_0.4.26

loaded via a namespace (and not attached):
[1] grid_2.15.1    lattice_0.20-6 tools_2.15.1  

Asked by Darren Cook, Aug 15 '12 06:08


1 Answer

It seems you are using the RUnit package for your unit tests. In this case, you need to be aware that RUnit uses a different default for the kind of random number generator (RNGkind).

From the RUnit manual, and the help for ?defineTestSuite:

defineTestSuite(name, dirs, testFileRegexp = "^runit.+\\.[rR]$",
  testFuncRegexp = "^test.+",  
  rngKind = "Marsaglia-Multicarry",
  rngNormalKind = "Kinderman-Ramage")

Notice that the default rngKind in RUnit is "Marsaglia-Multicarry", and the default rngNormalKind is "Kinderman-Ramage".

However, in base R, the default RNGkind is "Mersenne-Twister" (with "Inversion" as the default normal.kind). From ?RNGkind:

The currently available RNG kinds are given below. kind is partially matched to this list. The default is "Mersenne-Twister".
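For the troubleshooting part of the question, calling RNGkind() with no arguments reports the generators currently in effect, which is easier to read than decoding the first element of .Random.seed. A minimal sketch, assuming a helper named rng_report (a hypothetical name, not part of R or RUnit):

```r
# Hypothetical helper: report which generators are active right now.
rng_report <- function() {
  kinds <- RNGkind()  # character vector: uniform kind, normal kind, ...
  seed_length <- if (exists(".Random.seed", envir = globalenv())) {
    length(get(".Random.seed", envir = globalenv()))
  } else {
    NA_integer_  # no seed set and no random numbers drawn yet
  }
  list(kind = kinds[1], normal.kind = kinds[2], seed_length = seed_length)
}

set.seed(1)
report <- rng_report()
print(report)
```

Calling a helper like this both interactively and at the top of a test function should show whether the two environments are using different generators.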


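This also explains the seed lengths you saw. According to ?RNGkind, the Marsaglia-Multicarry seed is two integers, while the Mersenne-Twister seed is 624 integers plus a current position; .Random.seed prepends one element that codes the generator kinds, giving 3 and 626 elements respectively.

So, to match your interactive results with the results of RUnit, make both sides use the same generators: either call RNGkind("Marsaglia-Multicarry", "Kinderman-Ramage") in your interactive session, or pass rngKind = "Mersenne-Twister" and rngNormalKind = "Inversion" in your call to defineTestSuite.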
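Both options might look roughly like the sketch below. The suite name "my tests" and the "tests" directory are assumptions (the directory is guessed from the path in the question), and the defineTestSuite call only runs if RUnit is installed.

```r
# Option 1: make an interactive session use the generators that
# ?defineTestSuite lists as RUnit's defaults.
# Newer R versions may warn about this combination, so warnings are muffled.
suppressWarnings(
  RNGkind(kind = "Marsaglia-Multicarry", normal.kind = "Kinderman-Ramage")
)
set.seed(1)
interactive_like_runit <- runif(3)

# Re-seeding under the same generators reproduces the same draws.
set.seed(1)
stopifnot(identical(interactive_like_runit, runif(3)))

# Put the base R defaults back when finished.
RNGkind(kind = "Mersenne-Twister", normal.kind = "Inversion")

# Option 2: keep base R's generators and tell RUnit to use them instead.
# The suite name and the "tests" directory are assumptions.
if (requireNamespace("RUnit", quietly = TRUE)) {
  suite <- RUnit::defineTestSuite(
    name = "my tests",
    dirs = "tests",
    rngKind = "Mersenne-Twister",
    rngNormalKind = "Inversion"
  )
}
```

Option 2 is probably the less surprising choice, since the tests then behave the same way as ordinary R code that calls set.seed().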

Answered by Andrie, Nov 16 '22 00:11