I often use numpy.random.seed() for reproducibility of results. I also find it helpful when books/tutorials use it (e.g., np.random.seed(42)), so that people can reproduce things more easily.
The documentation for random.seed() now says it is a legacy function and its use is not encouraged, but gives no explanation. I am wondering why:
random.seed()
Reseed a legacy MT19937 BitGenerator
This is a convenience, legacy function.
The best practice is to not reseed a BitGenerator,
rather to recreate a new one. This method is here
for legacy reasons.
Note the documentation goes on to say:
# This example demonstrates best practice.
from numpy.random import MT19937
from numpy.random import RandomState, SeedSequence
rs = RandomState(MT19937(SeedSequence(123456789)))
# Later, you want to restart the stream
rs = RandomState(MT19937(SeedSequence(987654321)))
I'm not sure why this is best practice. Why is it better than using random.seed(), particularly when you are teaching a concept, want to reproduce the same result every time, and want to keep your code simple? At least on the surface, random.seed() also invokes the MT19937 BitGenerator (though according to the docs it is a legacy bit generator, so the two are not the same).
The best practice is to make a new BitGenerator rather than reseed one.
Using np.random.seed(number) sets a global seed, which affects every use of np.random.*. That global state can be disturbed by other packages or scripts if they also call np.random.seed(another_number), which resets the global random seed. When that happens you no longer get the reproducibility you were looking for. Worse, the reseeding would probably happen under the hood, so the cause would not be obvious.
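To make the interaction concrete, here is a minimal sketch. The function third_party_helper is hypothetical and stands in for any library code that reseeds the global state; np.random.default_rng is NumPy's constructor for a local Generator and is not mentioned in the question, so treat its use here as one possible alternative rather than the only one.

```python
import numpy as np


def third_party_helper():
    # Hypothetical library code that reseeds the global state behind your back.
    np.random.seed(0)


def draw_with_global_seed(interfere):
    # Relies on the global legacy state set by np.random.seed().
    np.random.seed(42)
    if interfere:
        third_party_helper()
    return np.random.random(3)


def draw_with_local_generator(interfere):
    # Owns its generator, so reseeding the global state cannot touch it.
    rng = np.random.default_rng(42)
    if interfere:
        third_party_helper()
    return rng.random(3)


global_clean = draw_with_global_seed(False)
global_disturbed = draw_with_global_seed(True)
local_clean = draw_with_local_generator(False)
local_disturbed = draw_with_local_generator(True)

# The globally seeded draws change once the helper interferes.
print(np.array_equal(global_clean, global_disturbed))
# The locally owned generator is unaffected.
print(np.array_equal(local_clean, local_disturbed))
```

The first comparison should come out False and the second True: only the code that holds its own generator object stays reproducible when something else calls np.random.seed().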
As mentioned in the comments, you can read about the reasoning in NumPy's new random number generation policy. The new policy aims for more consistent results across operating systems and build methods, while the old seeding interface remains accessible but is now considered legacy.
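For teaching code, recreating a generator does not have to be more verbose than reseeding. The sketch below assumes np.random.default_rng is available (NumPy 1.17 or later); the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Create a generator with a fixed seed instead of calling np.random.seed(42).
rng = np.random.default_rng(42)
first_run = rng.normal(size=5)

# Later, to restart the stream, create a new generator rather than reseeding.
rng = np.random.default_rng(42)
second_run = rng.normal(size=5)

# Both runs start from the same seed, so they yield the same values.
print(np.array_equal(first_run, second_run))
```

Note that default_rng does not use the legacy MT19937 stream, so its numbers differ from those produced after np.random.seed(42); reproducibility holds between runs that use the same constructor and seed.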