How do you prove that one RNG is better than another?
I don't mean in terms of runtime, but rather the amount of entropy "generated" - which also rolls in the notion of periodicity (low period = low entropy).
Can an RNG be provably optimal? Or is this an unobtainable goal? By optimal, I mean that any sequence is equally likely and independent of past or future results.
I'm interested in algorithms, not cosmic background sampling devices or other sources of physical "randomness" (is it random or just complex?)
There are generally two types of random number generators: pseudorandom number generators (PRNGs) and true random number generators (TRNGs). Software-based PRNGs use deterministic algorithms to mimic the selection of values and approximate true randomness.
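To make "algorithms that mimic randomness" concrete, here is a minimal sketch of a linear congruential generator, one of the simplest PRNGs. The constants are the common Numerical Recipes parameters; this is an illustration, not a generator to use in practice:

```python
class LCG:
    """A minimal linear congruential generator: a deterministic
    recurrence that only *approximates* randomness."""

    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def next(self):
        # state_{n+1} = (a * state_n + c) mod 2^32
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state

rng = LCG(seed=42)
print([rng.next() % 100 for _ in range(5)])  # same seed -> same sequence, every time
```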
Hypothesis: to apply the runs test of randomness, first set up the null and alternative hypotheses. In the runs test, the null hypothesis assumes that the sequence is random (in the two-sample form, that the distributions of the two continuous populations are the same); the alternative hypothesis is its opposite.
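For what the runs test actually computes, here is a sketch of the one-sample Wald-Wolfowitz version in Python. It uses the normal approximation, so it assumes a reasonably long sequence:

```python
import math

def runs_test(bits):
    """Wald-Wolfowitz runs test: H0 = the sequence is random.
    Returns the z-statistic; |z| > 1.96 rejects H0 at the 5% level."""
    n0 = bits.count(0)
    n1 = bits.count(1)
    n = n0 + n1
    # A "run" is a maximal block of identical symbols.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    mu = 2.0 * n0 * n1 / n + 1           # expected number of runs under H0
    var = (mu - 1) * (mu - 2) / (n - 1)  # variance under H0
    return (runs - mu) / math.sqrt(var)

print(runs_test([0, 1] * 50))          # perfectly alternating: far too many runs
print(runs_test([0] * 50 + [1] * 50))  # one long run of each: far too few
```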
See http://www.random.org/analysis/
The one and only optimal RNG: [image]
The National Institute of Standards and Technology has some good info on this:
http://csrc.nist.gov/groups/ST/toolkit/rng/index.html
It looks like there is a test suite and lots of good reference material.
The old standard for testing used to be the "Diehard tests": http://en.wikipedia.org/wiki/Diehard_tests. These have been superseded by the NIST tests that DKnight pointed out: http://csrc.nist.gov/groups/ST/toolkit/rng/index.html. The Diehard wiki article gives you a good overview of the kinds of things that are looked at. The NIST material will take a bit more digging.
As you state it, no pseudo-RNG (i.e., an algorithm) can be proven optimal. They all have a seed value and depend on that seed and their internal state to generate each value: if you know the seed and the state, you know the next value. As an example, check out http://en.wikipedia.org/wiki/Mersenne_twister. I like it mostly because of the awesome name, but the article does a good job explaining the tenets of a PRNG.
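To see that concretely: Python's built-in `random` module happens to be a Mersenne Twister (MT19937), and its `getstate`/`setstate` API makes the "know the state, know the next value" point easy to demonstrate:

```python
import random

# The output is fully determined by the seed/state, which is why no PRNG
# can be "optimal" in the question's sense.
a = random.Random(12345)
b = random.Random(12345)
print([a.random() for _ in range(3)] == [b.random() for _ in range(3)])  # True

# Knowing the internal state is just as good as knowing the seed:
state = a.getstate()
next_value = a.random()
a.setstate(state)                # rewind the generator
print(a.random() == next_value)  # True: the "next value" was predictable
```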
There is an objective definition from computational complexity theory: the asymptotic time complexity required to distinguish a generator's output from truly random data.
A good PRNG should force a distinguisher to spend much more time as the size of the seed increases or as the required level of certainty increases. (With fixed-size seeds I suppose you'd look at the actual running time as well as the program complexity.)
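As a toy illustration of what a distinguisher is (not a real cryptanalytic test), here is a frequency-based one: it guesses "not random" when the bit bias is statistically visible. The biased generator below is a hypothetical stand-in for a weak PRNG:

```python
import random

def distinguisher(bits, threshold=3.0):
    """Toy distinguisher: guesses 'not random' if the bit frequency
    deviates from 1/2 by more than `threshold` standard deviations.
    A cryptographically good PRNG forces every efficient test like this
    to succeed with probability only negligibly better than chance."""
    n = len(bits)
    ones = sum(bits)
    z = (ones - n / 2) / (n / 4) ** 0.5  # std dev of a fair-coin sum is sqrt(n/4)
    return abs(z) > threshold

fair = [random.getrandbits(1) for _ in range(10_000)]
biased = [1 if random.random() < 0.53 else 0 for _ in range(10_000)]  # weak "PRNG"
print(distinguisher(fair))    # almost always False
print(distinguisher(biased))  # almost always True: the bias is detectable
```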