I'd like my script to create the same array of numbers each time I run the script. Earlier I was using np.random.seed(). For example:
import numpy as np
np.random.seed(1)
X = np.random.random((3,2))
I've read that RandomState should be used instead of np.random.seed(), but I have no idea how to use it. I tried some combinations, but none worked.
RandomState exposes a number of methods for generating random numbers drawn from a variety of probability distributions. In addition to the distribution-specific arguments, each method takes a keyword argument size that defaults to None. If size is None, then a single value is generated and returned.
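To illustrate the size argument, here is a minimal sketch. The seed 0 and the choice of normal and uniform are arbitrary assumptions made for the example, not something the text above prescribes.

```python
import numpy as np

rng = np.random.RandomState(0)  # seed 0 is an arbitrary choice for this sketch

# size left at its default (None): a single value is returned
single = rng.normal(loc=0.0, scale=1.0)

# size as an int: a 1-D array with that many draws
vector = rng.normal(loc=0.0, scale=1.0, size=5)

# size as a tuple: an array with that shape
matrix = rng.uniform(low=0.0, high=1.0, size=(3, 2))

print(np.ndim(single), vector.shape, matrix.shape)
```

The same pattern should apply to the other distribution methods, since they share the size keyword.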
With random_state=42, we get the same train and test sets across different executions, but this time the train and test sets differ from the previous case with random_state=0. Because the train and test sets determine what the model is fitted and evaluated on, they directly affect the model's performance score.
If random_state is an integer, it is used to seed a new RandomState object. This makes results reproducible, so they can be checked and validated when running the code multiple times. Setting random_state to a fixed value guarantees that the same sequence of random numbers is generated each time you run the code.
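As a rough illustration of why a fixed integer gives the same split every time, here is a NumPy-only sketch. split_indices is a hypothetical helper written for this example; it is not how any particular library implements its splitting, only the same idea in miniature.

```python
import numpy as np

def split_indices(n_samples, test_fraction, seed):
    """Hypothetical helper: shuffle indices with a seeded RandomState, then cut."""
    rng = np.random.RandomState(seed)       # the integer seeds a new generator
    order = rng.permutation(n_samples)      # shuffled copy of 0..n_samples-1
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]   # (train indices, test indices)

# Same seed twice: the two splits are identical
train_a, test_a = split_indices(10, 0.3, seed=42)
train_b, test_b = split_indices(10, 0.3, seed=42)
print(np.array_equal(train_a, train_b), np.array_equal(test_a, test_b))
```

Calling the helper with a different seed would generally produce a different shuffle, which is the effect described for random_state=0 versus random_state=42.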
random_state, as the name suggests, is used to initialize the internal random number generator, which decides how the data is split into train and test indices in your case. The documentation states: If random_state is None or np.random, then a randomly-initialized RandomState object is returned.
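The rules quoted above can be sketched as a small function. as_random_state is a hypothetical name used only for this example, and the handling of an existing RandomState instance (returning it unchanged) is my assumption rather than something stated in the quote.

```python
import numbers
import numpy as np

def as_random_state(random_state=None):
    """Hypothetical helper turning a random_state argument into a RandomState."""
    if random_state is None or random_state is np.random:
        # No seed given: RandomState() seeds itself from OS entropy
        return np.random.RandomState()
    if isinstance(random_state, numbers.Integral):
        # Integer: seed a new, reproducible generator
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # Assumption: an existing generator is passed through as-is
        return random_state
    raise ValueError("cannot build a RandomState from %r" % (random_state,))

a = as_random_state(42)
b = as_random_state(42)
print(a.rand() == b.rand())  # equal integer seeds give equal first draws
```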
It's true that it's sometimes advantageous to make sure you get your entropy from a specific (non-global) stream. Basically, all you have to do is to make a RandomState object and then use its methods instead of using numpy's random functions. For example, instead of
>>> np.random.seed(3)
>>> np.random.rand()
0.5507979025745755
>>> np.random.randint(10**3, 10**4)
7400
You could write
>>> R = np.random.RandomState(3)
>>> R
<mtrand.RandomState object at 0x7f79b3315f28>
>>> R.rand()
0.5507979025745755
>>> R.randint(10**3, 10**4)
7400
So all you need to do is make R and then use R. instead of np.random., which is pretty simple. You can also pass R around as you want and have multiple random streams (for example, if you want one process to stay the same while another changes).
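Applied to the array from the question, this could look like the sketch below. It uses random_sample, the RandomState method that np.random.random aliases; make_data and the fixed/varying names are hypothetical, added only to show passing a generator around.

```python
import numpy as np

# Original approach: seed the global generator
np.random.seed(1)
X_global = np.random.random((3, 2))

# RandomState approach: a private generator with the same seed
R = np.random.RandomState(1)
X_local = R.random_sample((3, 2))

print(np.array_equal(X_global, X_local))

def make_data(rng, n):
    """Hypothetical function that draws from whichever stream it is handed."""
    return rng.random_sample(n)

fixed = np.random.RandomState(1)   # same numbers on every run
varying = np.random.RandomState()  # unseeded, so it changes between runs

stable_part = make_data(fixed, 4)
changing_part = make_data(varying, 4)
```

Because fixed and varying are separate objects, drawing from one does not advance the other.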