I just read this interesting question about a random number generator that never generates the same value three consecutive times. This clearly makes it different from a standard uniform random number generator, but I'm not sure how to quantitatively describe how it differs from a generator without that restriction.
Suppose you handed me two random number generators, R and S, where R is a true random number generator and S is a true random number generator that has been modified to never produce the same value three consecutive times. If you didn't tell me which was which, the only way I can think of to detect this would be to run both generators until one of them produced the same value three consecutive times.
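For concreteness, a rough sketch of that naive approach (the alphabet size K, sample length N, and the stand-in generators are illustrative assumptions, not part of the original problem):

    import itertools
    import random

    K = 10          # alphabet size (assumed for illustration)
    N = 5000        # sample length

    def has_triple(seq):
        """True if seq contains the same value three times in a row."""
        return any(seq[i] == seq[i + 1] == seq[i + 2] for i in range(len(seq) - 2))

    def triple_free_probability(n, k):
        """Rough probability that a true RNG over k symbols produces n outputs
        with no triple (treating overlapping positions as independent)."""
        return (1.0 - 1.0 / k ** 2) ** max(n - 2, 0)

    def true_rng(k):
        """Stand-in for R: unrestricted uniform generator."""
        while True:
            yield random.randrange(k)

    def no_triple_rng(k):
        """Stand-in for S: uniform generator that rejects a third repeat."""
        prev2 = prev1 = None
        while True:
            x = random.randrange(k)
            while prev2 is not None and x == prev1 == prev2:
                x = random.randrange(k)   # re-draw to forbid a third repeat
            prev2, prev1 = prev1, x
            yield x

    if __name__ == "__main__":
        for name, gen in (("R", true_rng(K)), ("S", no_triple_rng(K))):
            sample = list(itertools.islice(gen, N))
            print(name,
                  "saw a triple" if has_triple(sample) else
                  f"no triple in {N} draws "
                  f"(a true RNG avoids that with prob ~{triple_free_probability(N, K):.3g})")

With K = 10, a true RNG is overwhelmingly likely to show a triple within a few thousand draws, so the absence of triples over a long run is itself strong statistical evidence that you are looking at S.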
My question is: is there a better algorithm for telling the two generators apart? Does the restriction somehow affect the observable behavior of the generator in any way other than preventing three of the same value from coming up in a row?
In this regard, Approximate Entropy (ApEn) is a statistical measure of the randomness of a data series, based on counting patterns and their repetitions. Low values of the statistic indicate many repeated patterns; high values indicate randomness and unpredictability.
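A compact sketch of the standard ApEn(m, r) computation, using the common choices m = 2 and r equal to a fraction of the series' standard deviation (the example series are made up for illustration):

    import math
    import random

    def _phi(u, m, r):
        """Phi^m(r): average log frequency of m-length patterns within tolerance r."""
        n = len(u) - m + 1
        patterns = [u[i:i + m] for i in range(n)]
        freqs = []
        for p in patterns:
            c = sum(1 for q in patterns
                    if max(abs(a - b) for a, b in zip(p, q)) <= r)
            freqs.append(c / n)
        return sum(math.log(f) for f in freqs) / n

    def approximate_entropy(u, m=2, r=None):
        """ApEn(m, r) = Phi^m(r) - Phi^(m+1)(r). Lower values mean more regularity."""
        if r is None:
            # A common choice is a fraction of the series' standard deviation.
            mean = sum(u) / len(u)
            sd = (sum((x - mean) ** 2 for x in u) / len(u)) ** 0.5
            r = 0.2 * sd
        return _phi(u, m, r) - _phi(u, m + 1, r)

    if __name__ == "__main__":
        random_series = [random.random() for _ in range(300)]
        periodic_series = [(i % 4) / 4 for i in range(300)]
        print("random   ApEn ~", round(approximate_entropy(random_series), 3))
        print("periodic ApEn ~", round(approximate_entropy(periodic_series), 3))

The periodic series scores near zero because every pattern recurs exactly, while the random series scores noticeably higher.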
There are two phases in the random number generation process to test. First, you need a source of entropy[1] that is impossible to guess, like the weather. Second, you need a deterministic algorithm to expand that seed into a multitude of sequences for keys and the like. Testing usually starts with checking your entropy source.
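A minimal sketch of that two-phase structure, assuming os.urandom as the entropy source and SHA-256 counter-mode expansion (illustrative only, not a vetted DRBG design):

    import hashlib
    import os

    class TwoPhaseRNG:
        """Phase 1: gather a hard-to-guess seed. Phase 2: expand it deterministically.
        Illustrative only; real applications should use the OS CSPRNG (e.g. Python's
        secrets module) rather than a home-grown construction."""

        def __init__(self, seed=None):
            # Phase 1: entropy collection (here delegated to the operating system).
            self.seed = seed if seed is not None else os.urandom(32)
            self.counter = 0

        def next_bytes(self, n=32):
            # Phase 2: deterministic expansion of the seed with a running counter.
            out = b""
            while len(out) < n:
                block = hashlib.sha256(self.seed + self.counter.to_bytes(8, "big")).digest()
                out += block
                self.counter += 1
            return out[:n]

    if __name__ == "__main__":
        rng = TwoPhaseRNG()
        print(rng.next_bytes(16).hex())
        # Same seed -> same output stream, which is why testing starts with the entropy.
        print(TwoPhaseRNG(seed=b"\x00" * 32).next_bytes(16).hex())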
Two theoretical tests are the lattice test (Marsaglia, 1972) and the spectral test described by Knuth (1969, Section 3.3.4).
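As a small illustration of the structure those tests are designed to detect (a sketch, not the tests as Marsaglia or Knuth describe them): the infamous RANDU generator, x_{n+1} = 65539 * x_n mod 2^31, satisfies an exact linear relation among consecutive outputs, which is why its triples fall on only 15 planes in the unit cube, exactly the kind of lattice structure the spectral test quantifies.

    M = 2 ** 31

    def randu(seed, n):
        """RANDU: x_{n+1} = 65539 * x_n mod 2^31."""
        x, out = seed, []
        for _ in range(n):
            x = (65539 * x) % M
            out.append(x)
        return out

    xs = randu(1, 1000)
    # Every triple satisfies x_{n+2} - 6*x_{n+1} + 9*x_n == 0 (mod 2^31).
    violations = sum((xs[i + 2] - 6 * xs[i + 1] + 9 * xs[i]) % M != 0
                     for i in range(len(xs) - 2))
    print("triples off the 15-plane lattice:", violations)   # prints 0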
It is necessary to test random numbers because the numbers we generate are pseudo-random, not truly random. A pseudo-random number generator must produce a sequence of numbers that is uniformly distributed, uncorrelated, and does not repeat itself.
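Here is a minimal sketch of two such checks, uniformity via a chi-squared statistic and correlation via lag-1 autocorrelation (the bin count, sample size, and critical value are illustrative choices):

    import random

    def chi_square_uniformity(samples, bins=10):
        """Chi-squared statistic against a uniform distribution over [0, 1)."""
        counts = [0] * bins
        for x in samples:
            counts[min(int(x * bins), bins - 1)] += 1
        expected = len(samples) / bins
        return sum((c - expected) ** 2 / expected for c in counts)

    def lag1_autocorrelation(samples):
        """Serial (lag-1) correlation; near 0 for an uncorrelated sequence."""
        n = len(samples)
        mean = sum(samples) / n
        num = sum((samples[i] - mean) * (samples[i + 1] - mean) for i in range(n - 1))
        den = sum((x - mean) ** 2 for x in samples)
        return num / den

    if __name__ == "__main__":
        data = [random.random() for _ in range(10_000)]
        # With 10 bins there are 9 degrees of freedom; values far above ~16.9
        # (the 5% critical value) suggest the sequence is not uniform.
        print("chi-squared:", round(chi_square_uniformity(data), 2))
        print("lag-1 autocorrelation:", round(lag1_autocorrelation(data), 4))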
As a consequence of Rice's Theorem, there is no way to tell which is which.
Proof: Let L be the set of outputs of the normal RNG. Let L' be L with every output that contains the same value three or more times in a row removed. Some TMs recognize L', but some do not. Therefore, by Rice's theorem, determining whether a TM accepts L' is not decidable.
As others have noted, you may be able to make an assertion like "it has run for N steps without repeating a value three times", but you can never make the leap to "it will never repeat a value three times." More precisely, there exists at least one machine for which you cannot determine whether it meets this criterion.
Caveat: if you had a truly random generator (e.g. nuclear decay), it is possible that Rice's theorem would not apply. My intuition is that the theorem still holds for these machines, but I've never heard it discussed.
EDIT: a secondary proof. Suppose P(X) determines with high probability whether or not X accepts L'. We can construct an (infinite number of) programs F like:

    F(x): if x(F), then don't accept L'
          else, accept L'

P cannot determine the behavior of F(P). Moreover, say P correctly predicts the behavior of G. We can construct:

    F'(x): if x(F'), then don't accept L'
           else, run G(x)

So for every good case, there must exist at least one bad case.