
Replicating seed setting from Stata

I'm trying to replicate in R a bit of code someone else wrote in Stata, and have hit a wall trying to predict the behavior of Stata's pseudo-random number generator (PRNG).

Their code has this snippet:

set seed 123456

Unfortunately, it's unclear exactly which algorithm Stata uses. This question suggests it's a KISS algorithm, but the asker didn't manage to replicate the numbers in the end (and some of the links there seem to be dead or outdated). The Stata manual entry for set seed doesn't mention anything about algorithms. This other question on the topic also appears to have gone unresolved.

Is it a fool's errand to try and replicate Stata's random numbers?

I don't know which version of Stata was used to create this.

asked Mar 23 '16 17:03 by MichaelChirico




1 Answer

In short: Yes, it is a fool's errand.

Stata, being proprietary software, has not published all of the details of its core components, such as its random number generator. However, documentation is available (link for Stata 14), most pertinently:

runiform() is the basis for all the other random-number functions because all the other random-number functions transform uniform (0, 1) random numbers to the specified distribution.

runiform() implements the Mersenne Twister 64-bit (MT64) and the “keep it simple stupid” 32-bit (KISS32) algorithms for generating uniform (0, 1) random numbers. runiform() uses the MT64 algorithm by default.

runiform() uses the KISS32 algorithm only when the user version is less than 14 or when the random-number generator has been set to kiss32...

Recall also from ?Random in R that, for R's Mersenne Twister:

The ‘seed’ is a 624-dimensional set of 32-bit integers plus a current position in that set.

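Stata builds the internal state of its generator from the seed itself, by a procedure that should be nearly impossible to guess. Moreover, the description above is of R's 32-bit Mersenne Twister, while Stata's default MT64 is a 64-bit variant, so the two generators would not be expected to produce the same stream even from the same seed value.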
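To make the point about generator state concrete, here is a minimal sketch in base R. It only inspects R's own Mersenne Twister; it says nothing about how Stata maps 123456 to its internal state, which is the part that cannot be reproduced. The variable names (state, words, x1, x2) are just for illustration.

```r
# Select R's Mersenne Twister explicitly so the sketch does not depend on session settings.
RNGkind(kind = "Mersenne-Twister")
set.seed(123456)

# Snapshot the full generator state that set.seed() derived from the seed.
state <- get(".Random.seed", envir = globalenv())
words <- state[-(1:2)]   # drop the RNG-kind code and the position counter
length(words)            # 624 integers, as described in ?Random

# Restoring the full state, not just the seed value, replays the same draws.
x1 <- runif(3)
assign(".Random.seed", state, envir = globalenv())
x2 <- runif(3)
identical(x1, x2)        # TRUE
```

The seed 123456 is only an input to R's own scrambling routine that fills those 624 words, so matching the seed value across programs would not be enough to match the state.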

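I suggest generating the random numbers in Stata, saving them to a .dta file, and reading them into a vector/matrix/etc. in R using the haven package:

library(haven)
# Read the Stata dataset that holds the exported random numbers
mydata <- read_dta("mydata.dta")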
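Once the numbers are in R, one way to use them is to replay them in order wherever the Stata code called runiform(). The following is only a sketch under assumptions: the column name u, the helper make_replay(), and the four stand-in values are all hypothetical, and the data frame here replaces the result of read_dta() so the example runs on its own.

```r
# Stand-in for the imported data; in practice this would come from
# read_dta("mydata.dta"). The column name `u` and the values are hypothetical.
mydata <- data.frame(u = c(0.12, 0.57, 0.93, 0.34))

# Build a function that hands out the stored draws in order,
# mimicking successive calls to Stata's runiform().
make_replay <- function(draws) {
  i <- 0
  function(n) {
    if (i + n > length(draws)) stop("not enough stored draws")
    out <- draws[(i + 1):(i + n)]
    i <<- i + n   # advance the position for the next call
    out
  }
}

stata_runif <- make_replay(mydata$u)
first_two <- stata_runif(2)   # the first two stored draws
next_two <- stata_runif(2)    # the following two
```

This only reproduces the Stata results if the R code consumes the draws in the same order, and in the same quantity, as the original Stata code did.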
answered Oct 22 '22 22:10 by Chris Conlan