sklearn random state not random

Tags: python, random, scikit-learn, cross-validation

I have been experimenting with the random_state parameter of StratifiedKFold in sklearn, but it does not seem to have any effect. I would expect random_state=5 to give me a different test set than random_state=4, but this does not seem to be the case. I have put together a crude reproducible example below. First I load my data:

import numpy as np
from sklearn.model_selection import StratifiedKFold  # n_splits and .split() belong to the model_selection API
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target

Then I set random_state=5 and store the test indices of the last fold:

skf = StratifiedKFold(n_splits=5, random_state=5)
# after the loop, full_test_1 holds the test indices of the last fold
for train, test in skf.split(X, y):
    full_test_1 = test
full_test_1

array([ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  90,  91,  92,
        93,  94,  95,  96,  97,  98,  99, 140, 141, 142, 143, 144, 145,
       146, 147, 148, 149])

Repeating the same procedure with random_state=4:

skf = StratifiedKFold(n_splits=5, random_state=4)
# after the loop, full_test_2 holds the test indices of the last fold
for train, test in skf.split(X, y):
    full_test_2 = test
full_test_2

array([ 40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  90,  91,  92,
        93,  94,  95,  96,  97,  98,  99, 140, 141, 142, 143, 144, 145,
       146, 147, 148, 149])

I can then check that they are equal:

np.array_equal(full_test_1,full_test_2)
True

I do not think that the two random states should be returning the same numbers. Is there a flaw in my logic or code?

asked May 17 '17 15:05

Bobe Kryant

1 Answer

From the linked docs

random_state : None, int or RandomState

When shuffle=True, pseudo-random number generator state used for shuffling. If None, use default numpy RNG for shuffling.

You aren't setting shuffle=True in your call to StratifiedKFold, so random_state won't do anything.
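
A minimal sketch of the fix, assuming a scikit-learn version that provides sklearn.model_selection: pass shuffle=True so that random_state actually drives the split. The helper name last_test_fold is made up for this illustration and is not part of scikit-learn. As far as I know, newer scikit-learn releases no longer ignore random_state silently when shuffle=False and raise a ValueError instead, so the original snippet may fail outright there.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import StratifiedKFold

X, y = datasets.load_iris(return_X_y=True)


def last_test_fold(seed):
    """Return the test indices of the last fold (hypothetical helper)."""
    # shuffle=True is what makes random_state take effect
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    test = None
    for _, test in skf.split(X, y):
        pass  # keep only the final fold's test indices
    return test


full_test_1 = last_test_fold(5)
full_test_2 = last_test_fold(4)

# Different seeds should now give different test folds
print(np.array_equal(full_test_1, full_test_2))
# The same seed reproduces the same fold
print(np.array_equal(full_test_1, last_test_fold(5)))
```

With shuffling enabled, the folds are still stratified (each test fold of iris holds 10 samples per class), but which samples land in each fold depends on the seed.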

answered Nov 03 '22 03:11

Personman
