sklearn KFold returning wrong indexes in Python

Tags:

scikit-learn

I am using the KFold class from the sklearn package in Python on a df (DataFrame) with non-contiguous row indexes.

This is the code:

from sklearn.model_selection import KFold

kFold = KFold(n_splits=10, shuffle=True, random_state=None)
for train_index, test_index in kFold.split(dfNARemove):...

Some of the values I get in train_index or test_index don't exist in my df's index.

What can I do?


asked Oct 08 '17 16:10

1 Answer

The KFold iterator yields the positional indices of the train and validation rows of the DataFrame, not their non-contiguous index labels. You can select the train and validation rows with the pandas .iloc indexer, which works by position:

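from sklearn.model_selection import KFold

kFold = KFold(n_splits=10, shuffle=True, random_state=None)
for train_index, test_index in kFold.split(dfNARemove):
    # train_index and test_index are row positions, so select with .iloc
    train_data = dfNARemove.iloc[train_index]
    test_data = dfNARemove.iloc[test_index]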
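
A minimal sketch of why the mismatch happens, using a small hypothetical DataFrame (the `df` name, the `x` column, and the index labels are illustrative assumptions, not taken from the question). The positions produced by the split always fall in `range(len(df))`, whatever the index labels are:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical frame with gaps in its index, e.g. after dropping rows with NAs.
df = pd.DataFrame({"x": np.arange(6)}, index=[0, 2, 5, 7, 11, 13])

kf = KFold(n_splits=3, shuffle=True, random_state=0)
all_positions = []
for train_pos, test_pos in kf.split(df):
    # Positions are integers in [0, len(df)), so .iloc always succeeds,
    # while label-based lookup could fail for positions like 1, 3 or 4.
    train_data = df.iloc[train_pos]
    test_data = df.iloc[test_pos]
    all_positions.extend(test_pos.tolist())

# Every row position appears in exactly one test fold.
covered = sorted(all_positions)
```

Here positions such as 1, 3, and 4 are valid for `.iloc` but are not labels in `df.index`, which is the situation described in the question.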

If you want to know which non-contiguous index labels correspond to train_index and test_index on each fold, you can do the following:

non_continuous_train_index = dfNARemove.index[train_index]
non_continuous_test_index = dfNARemove.index[test_index]
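
As a rough check of the label mapping, the sketch below uses a hypothetical frame (the `df` name, column, and labels are assumptions for illustration) and compares label-based selection with `.loc` against position-based selection with `.iloc` for one fold. It assumes the index labels are unique:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical data with a non-contiguous, unique index.
df = pd.DataFrame({"x": np.arange(6)}, index=[0, 2, 5, 7, 11, 13])

kf = KFold(n_splits=3, shuffle=True, random_state=0)
train_pos, test_pos = next(kf.split(df))  # take the first fold only

# Translate row positions into the frame's own index labels.
train_labels = df.index[train_pos]
test_labels = df.index[test_pos]

# With unique labels, .loc on the labels selects the same rows as .iloc on the positions.
same_train = df.loc[train_labels].equals(df.iloc[train_pos])
same_test = df.loc[test_labels].equals(df.iloc[test_pos])
```

If the index contains duplicate labels, `.loc` can return more rows than `.iloc`, so position-based selection is the safer default.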


answered Oct 23 '22 22:10

Eduard Ilyasov

HilaD
