 

Testing the quality of PRNGs

I am playing around with PRNGs (like the Mersenne Twister and the stdlib rand() function) and I want a good test to help me ascertain the quality of the random data they produce. I have calculated the value of pi using random numbers from each PRNG, but the estimates from rand() and the Mersenne Twister are too close to tell apart (do I need to scrutinize beyond 10 decimal places?).

I do not have much idea about Monte Carlo simulations; please let me know about some algorithm/application (possibly something simple yet which could provide good inferences) that would help me distinguish them in terms of quality.


EDIT 1: I didn't notice before, but there is a similar thread: How to test random numbers?

EDIT 2: I am not able to interpret the results of the NIST suite, as mentioned in one of the comments. I got the idea of visually inspecting the output for patterns (if any) from random.org and am following it because of its simplicity. I would be very glad if someone could comment on my testing process:

  1. Generate N random numbers in [0,1] using rand() and MT19937
  2. if (round(genrand_real1() / rand_0_1())) then red pixel, else black

I understand this is not a very precise solution, but if it provides a reasonable estimate, I can live with it for the moment.

asked Mar 20 '12 by Sayan



1 Answer

There are several statistical test suites available. I wrote, copied, and otherwise gathered together 120 PRNGs and tested each with a variety of test suites, giving 4 hours per PRNG per test suite:

  • PractRand (standard, 1 terabyte) found bias in 78 PRNGs
  • TestU01 (BigCrush) found bias in 50 PRNGs
  • RaBiGeTe (extended, 512 megabit, x1) found bias in 40 PRNGs
  • Dieharder (custom command line options) found bias in 25 PRNGs
  • Dieharder (-a command line option) found bias in 13 PRNGs
  • NIST STS (default, 64 megabit x128) found bias in 11 PRNGs

How many biases did each suite find that every other suite missed?

  • PractRand (standard, 1 terabyte) found 22 unique biases, in a wide variety of categories.
  • RaBiGeTe (extended, 512 megabit, x1) found 5 unique biases, all in LCGs and combination LCGs.
  • TestU01 (BigCrush) found 2 unique biases, both in small chaotic PRNGs.

No other test suite found any unique biases.

In short, only PractRand, TestU01, and possibly RaBiGeTe are worth using.

Full disclosure: I wrote PractRand, so either the set of PRNGs or any other non-qualitative measure could be biased in its favor.

Miscellaneous advantages:

  • PractRand and TestU01 produce output that is, in my opinion, the easiest to interpret.
  • PractRand and Dieharder tend to be the easiest to automate via their command-line interfaces.
  • PractRand and RaBiGeTe were the only ones to support multithreaded testing.

Miscellaneous disadvantages:

  • PractRand required more bits of input than the other test suites - a potential problem if your RNG is very slow or otherwise limited in how much data it can produce.
  • RaBiGeTe and NIST STS both have interface issues.
  • Dieharder and NIST STS both have false-positive issues.
  • NIST STS had the worst interface in my opinion.
  • I could not get Dieharder to compile on Windows. I managed to get TestU01 to compile on Windows, but it took some work.
  • Recent versions of RaBiGeTe are closed-source and Windows-only.

The set of PRNGs tested:

  • 1 large GFSR, 1 large LFSR, 4 xorshift-type PRNGs, 2 xorwow-type PRNGs, and 3 other not-quite-LFSR PRNGs.
  • 10 simple power-of-2 modulus LCGs (which discard low bits to reach acceptable quality levels), 10 power-of-2 modulus not-quite-LCGs, and 9 combination generators based primarily around LCGs and not-quite-LCGs.
  • 19 reduced-strength versions of CSPRNGs, plus 1 full-strength CSPRNG. Of those, 14 were based on indirection / dynamic s-boxes (e.g. RC4, ISAAC), 4 were ChaCha/Salsa parameterizations, and the remaining 2 were Trivium variants.
  • 11 PRNGs broadly classifiable as LFib-type or similar, not counting the LFSRs/GFSRs.
  • The rest (about 35) were small-state chaotic PRNGs, of which 10 used multiplication and the others were limited to arithmetic and bitwise logic.

Edit: There is also the test set in gjrand, which is very obscure and a little odd, but actually does extremely well.

Also, all of the PRNGs tested are included as non-recommended PRNGs in PractRand.

answered Dec 17 '22 by user3535668