Python: dangers of temporarily changing the random seed using a context manager?

When aiming for reproducibility in Python code using random number generators, the recommended approach seems to be to construct separate RandomState objects. Unfortunately, some essential packages like scipy.stats cannot (to the best of my knowledge) be set to use a specific RandomState and will just use the current state of numpy.random. My current workaround is to use a context manager which saves the state of the RNG and then resets it upon exiting as follows:

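import numpy as np

class FixedSeed:
    def __init__(self, seed):
        self.seed = seed
        self.state = None

    def __enter__(self):
        # Save the global RNG state, then reseed it for the duration of the block.
        self.state = np.random.get_state()
        np.random.seed(self.seed)

    def __exit__(self, exc_type, exc_value, traceback):
        # Restore the saved state, even if the block raised an exception.
        np.random.set_state(self.state)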

There are a lot of warnings in the documentation about changing the state in any way - is the above approach safe in general? (in the sense that the change is local to the context and that the rest of my code will be unaffected)

asked Sep 20 '15 12:09 by Bonnevie


1 Answer

The numpy documentation claims:

set_state and get_state are not needed to work with any of the random distributions in NumPy. If the internal state is manually altered, the user should know exactly what he/she is doing.

which does sound scary. A possible interpretation of this warning on a public, documented interface is that "know exactly what he/she is doing" means "know that reseeding a PRNG willy-nilly severely reduces the randomness". But reducing the randomness is exactly what you want here, and only for the duration of your context.

In support of this conjecture, I looked at numpy/test_random.py, which contains code like:

class TestSeed(TestCase):
    def test_scalar(self):
        s = np.random.RandomState(0)
        assert_equal(s.randint(1000), 684)
        s = np.random.RandomState(4294967295)
        assert_equal(s.randint(1000), 419)

because there they do need deterministic results. Note that they create an instance of np.random.RandomState rather than reseeding the global generator, but I could find no indication in the code that set_state() will break anything.
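
To illustrate the point about separate instances, here is a minimal sketch showing that drawing from a dedicated np.random.RandomState object does not advance the global numpy.random stream. The seeds and the variable names (local_rng, expected, actual) are arbitrary choices for this example, not taken from the NumPy test suite.

```python
import numpy as np

# Record what the global RNG produces from a known seed.
np.random.seed(0)
expected = np.random.rand(3)

# Reseed the global RNG, then draw from a separate instance in between.
np.random.seed(0)
local_rng = np.random.RandomState(12345)  # arbitrary seed for illustration
_ = local_rng.rand(3)                     # advances only local_rng
actual = np.random.rand(3)

# The global stream is the same as if local_rng had never been used.
global_unaffected = np.array_equal(expected, actual)
```

This is why a dedicated RandomState is the preferred route whenever the library you call accepts one; the context manager is only needed for code that insists on the global state.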

If in doubt, write a test suite which

  1. Seeds the default RNG to a fixed value
  2. Checks that the default RNG returns the same, expected values every time
  3. Uses your context manager
  4. Confirms that a new sequence of values is generated inside the context
  5. Confirms that the original seeded RNG from (1) continues to emit its expected sequence
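
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the FixedSeed class is the one from the question (with get_state called on np.random), the function name check_fixed_seed_is_local and the seed values are made up for the example, and the check uses np.random.rand as a stand-in for whatever distribution you actually draw from.

```python
import numpy as np


class FixedSeed:
    """Context manager from the question: reseed on entry, restore on exit."""

    def __init__(self, seed):
        self.seed = seed
        self.state = None

    def __enter__(self):
        self.state = np.random.get_state()
        np.random.seed(self.seed)

    def __exit__(self, exc_type, exc_value, traceback):
        np.random.set_state(self.state)


def check_fixed_seed_is_local(outer_seed=1234, inner_seed=42, n=5):
    """Hypothetical helper: returns True if FixedSeed leaves the outer stream intact."""
    # Steps 1-2: seed the default RNG and record the reference sequence.
    np.random.seed(outer_seed)
    expected = np.random.rand(2 * n)

    # Reseed and consume the first half of the reference sequence.
    np.random.seed(outer_seed)
    first = np.random.rand(n)

    # Steps 3-4: inside the context, values come from inner_seed instead.
    with FixedSeed(inner_seed):
        inside = np.random.rand(n)

    # Step 5: after the context, the outer stream resumes where it left off.
    second = np.random.rand(n)

    # Reference values for the inner seed, for comparison with `inside`.
    np.random.seed(inner_seed)
    inner_expected = np.random.rand(n)

    return bool(
        np.array_equal(first, expected[:n])
        and np.array_equal(second, expected[n:])
        and np.array_equal(inside, inner_expected)
        and not np.array_equal(inside, expected[n:])
    )
```

Calling check_fixed_seed_is_local() should return True if the context manager behaves as intended. Note that the helper itself reseeds the global RNG, so run it in a test rather than in the middle of a simulation.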

answered Oct 09 '22 07:10 by msw