How different do random seeds need to be?

Tags: random

Consider code like this (Python):

import random

for seed in [1, 2, 3, 4]:
    random.seed(seed)
    # Build a list of 100 pseudo-random floats in [0.0, 1.0)
    randNumbers = [random.random() for _ in range(100)]
    doStuff(randNumbers)  # doStuff is a placeholder for the real processing

I want to make sure that randNumbers differs significantly from one call to the next. Do I need to make sure the seeds differ significantly between subsequent calls, or is it sufficient that the seeds are different (no matter how)?

To the pedants: please realize the above code is super-over-simplified.

David D asked Oct 12 '09 14:10


3 Answers

Short answer: Avoid the re-seeding, as it doesn't buy you anything here. Long answer below.


That all depends on what exactly you need. In "Common defects in initialization of pseudorandom number generators" it is pointed out that linearly dependent seeds (which 1, 2, 3, 4 definitely are) are a bad choice for initializing multiple PRNGs, at least when they are used for simulation and you want uncorrelated results.

If all you do is roll a few dice, or generate some pseudo-random input for something uncritical, then it very likely doesn't matter.

Note also that using some classes of PRNG to generate the seeds has the same problem, because they produce linearly dependent numbers (LCGs spring to mind).
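To make the LCG remark concrete, here is a minimal sketch of a linear congruential generator used as a seed source. The constants are the well-known Numerical Recipes parameters, chosen only for illustration; the function name `lcg_next` and the starting state are assumptions, not anything from the question.

```python
def lcg_next(state, a=1664525, c=1013904223, m=2**32):
    """Advance a linear congruential generator by one step."""
    return (a * state + c) % m

# Generate four "seeds" from an LCG started at an arbitrary state.
state = 1
seeds = []
for _ in range(4):
    state = lcg_next(state)
    seeds.append(state)

# Every seed is an affine function (mod m) of the one before it,
# which is exactly the kind of linear relationship described above.
for previous, current in zip(seeds, seeds[1:]):
    assert current == (1664525 * previous + 1013904223) % 2**32
```

Whether that relationship actually hurts depends on the generator being seeded and on how sensitive the application is to correlation.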

Joey answered Sep 21 '22 07:09


If your random number generator is high quality, it shouldn't matter how you seed it. In fact, the best practice would be to seed it only once. Random number generators are designed to have certain statistical behavior once they're started. Frequently reseeding effectively creates a different random number generator, one that may not be as good.
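A minimal sketch of the seed-once approach, reworking the loop from the question. It assumes a dedicated `random.Random` instance; `do_stuff` is a hypothetical stand-in for the question's `doStuff`, and the seed value 12345 is arbitrary.

```python
import random

def do_stuff(numbers):
    """Hypothetical stand-in for doStuff: returns the mean of the batch."""
    return sum(numbers) / len(numbers)

# Seed a single generator once (12345 is an arbitrary illustrative value).
rng = random.Random(12345)

results = []
for _ in range(4):
    # Each batch continues the same stream, so no re-seeding is needed
    # to get different numbers on each pass.
    batch = [rng.random() for _ in range(100)]
    results.append(do_stuff(batch))
```

Using a separate `random.Random` object instead of the module-level functions keeps this stream isolated from other code that may call `random.seed()`.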

Randomly selecting seeds sounds like a good idea, but it isn't. Because of the "birthday paradox," the probability of picking the same seed twice becomes surprisingly high once the number of seeds you draw approaches the square root of the number of possible seed values.
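For a rough sense of scale, the usual birthday-problem approximation p ≈ 1 - exp(-n(n-1)/(2N)) can be evaluated directly. The sketch below assumes seeds are drawn uniformly from a 32-bit space (N = 2**32); that size is an assumption made for illustration, since many generators accept larger seeds.

```python
import math

def collision_probability(num_seeds, seed_space):
    """Approximate probability that at least two of num_seeds uniformly
    drawn seeds coincide, using 1 - exp(-n(n-1) / (2N))."""
    exponent = num_seeds * (num_seeds - 1) / (2.0 * seed_space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)

SEED_SPACE = 2**32  # assumed 32-bit seed space

p_few = collision_probability(4, SEED_SPACE)        # a handful of seeds
p_many = collision_probability(77163, SEED_SPACE)   # roughly 1.18 * sqrt(N)
```

With only a few seeds the collision chance is negligible; it reaches about one in two near 77,000 seeds in a 32-bit space, which matters mainly for large simulation campaigns.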

John D. Cook answered Sep 20 '22 07:09


Generally speaking, you only seed your random number generator when you need the random numbers to be generated in identical fashion each time through. This is useful when you have a random component to your processing, but need to test it and therefore want it to be consistent between tests. Otherwise, you let the system seed the generator itself.

In other words, by seeding the random number generator with specific predefined seeds, you are actually reducing the randomness of the system as a whole. The random numbers generated with a seed of 1 are indeed pseudo-randomly different from those generated with a seed of 2, but a hard-coded seed will produce the same random sequence on every run of the program.
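The trade-off can be checked in a few lines. This is a small sketch using Python's standard `random.Random`; the helper name `sample` and the seed values are illustrative assumptions.

```python
import random

def sample(seed, n=5):
    """Return n pseudo-random floats from a generator created with this seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

run_a = sample(1)   # first "run" with a hard-coded seed
run_b = sample(1)   # second "run" with the same seed
other = sample(2)   # a different hard-coded seed

# Same seed: identical sequence every time (useful for tests).
assert run_a == run_b
# Different seed: a different sequence, but still fixed across runs.
assert run_a != other

# No explicit seed: Python seeds the generator itself from OS entropy
# (or the clock), so the sequence varies between runs.
system_seeded = random.Random()
varying = [system_seeded.random() for _ in range(5)]
```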

Matt answered Sep 22 '22 07:09