Given a series of randomly generated data, how can I figure out how random it actually is? Is R a good tool for this, or MATLAB? What other questions can these tools answer about randomly generated data? Is there another tool better suited for this?
A sequence of numbers counts as random with respect to a specified distribution only when two conditions are met: the values are uniformly distributed over a defined interval or set, and it is impossible to predict future values from past or present ones.
The term 'random' is widely used in social science in the context of sampling and statistical analysis. A sample is random if all of the cases in it were selected by chance from a larger set of cases known as the population, and if all of the cases in the population had a genuine chance of being in the sample.
Sets of random numbers have no underlying structure, yet purely by chance they are still likely to contain apparent numerical patterns such as runs and repeats.
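As a rough illustration of the two conditions mentioned above (uniformity and unpredictability), here is a minimal Python sketch using only the standard library. The function names `chi_square_uniform` and `lag1_autocorrelation` are made up for this example, and it assumes the data are floats in [0, 1). It computes a chi-square statistic against a uniform distribution and a lag-1 serial correlation. Passing both checks does not prove a sequence is random; it only fails to find these two particular kinds of structure.

```python
import random


def chi_square_uniform(values, bins=10):
    """Chi-square statistic for values in [0, 1) against a uniform distribution."""
    counts = [0] * bins
    for v in values:
        # Clamp the index so a value of exactly 1.0 falls into the last bin.
        counts[min(int(v * bins), bins - 1)] += 1
    expected = len(values) / bins
    return sum((c - expected) ** 2 / expected for c in counts)


def lag1_autocorrelation(values):
    """Correlation between each value and its successor (near 0 if unpredictable)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0  # constant input: correlation is undefined, report 0 by convention
    cov = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return cov / var


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed only so the demo is repeatable
    sample = [rng.random() for _ in range(10_000)]
    # With 10 bins there are 9 degrees of freedom; the 5% critical value is 16.919.
    print("chi-square statistic:", chi_square_uniform(sample))
    print("lag-1 autocorrelation:", lag1_autocorrelation(sample))
```

A chi-square statistic well above the critical value suggests the values are not uniform, and a lag-1 autocorrelation far from zero suggests that each value carries information about the next. Dedicated batteries run many more tests of this kind.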
The DieHarder test battery by Robert G. Brown, which reimplements and extends the older DIEHARD suite by Marsaglia et al., has been wrapped into the R package RDieHarder, which you could start with.
Note that RDieHarder versions need their particular matching DieHarder releases -- and we're not there yet for the most recent development version of the latter.
Edit: Also, for the subset of cryptographic tests, the NIST suite (which is included in DieHarder) should be appropriate, as that is what it was designed for.