is there any way to set seed on train_test_split on python sklearn. I have set the parameter random_state
to an integer, but I still can not reproduce the result.
Thanks in advance.
seed(). Seeds allow you to create a starting point for randomly generated numbers, so that each time your code is run the same answer is generated. The advantage of doing this in your sampling is that you or anyone else can recreate the exact same training and test sets by using the same seed.
1. Splitting data into training/validation/test sets: random seeds ensure that the data is divided the same way every time the code is run. 2.
With random_state=42 , we get the same train and test sets across different executions, but in this time, the train and test sets are different from the previous case with random_state=0 . The train and test sets directly affect the model's performance score.
The train-test split is used to estimate the performance of machine learning algorithms that are applicable for prediction-based Algorithms/Applications. This method is a fast and easy procedure to perform such that we can compare our own machine learning model results to machine results.
from sklearn.model_selection import train_test_split
x = [k for k in range(0, 10)]
y = [k for k in range(0, 10)]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=11)
print (x_train)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=11)
print (x_train)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=11)
print (x_train)
The above code will produce the same result for x_train every time I split the data. It is possible that the randomness is in your dataframe, not train_test_split.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With