
scikit-learn: cross_val_predict only works for partitions

I am struggling to work out how to implement TimeSeriesSplit in sklearn.

The suggested answer at the link below yields the same ValueError.

sklearn TimeSeriesSplit cross_val_predict only works for partitions

Here is the relevant bit of my code:

from sklearn.model_selection import TimeSeriesSplit, cross_val_predict
from sklearn import svm

features = df[df.columns[0:6]]
target = df['target']

clf = svm.SVC(random_state=0)

pred = cross_val_predict(clf, features, target, cv=TimeSeriesSplit(n_splits=5).split(features))

ValueError                                Traceback (most recent call last)
<ipython-input-57-d1393cd05640> in <module>()
----> 1 pred = cross_val_predict(clf, features, target, cv=TimeSeriesSplit(n_splits=5).split(features))

/home/jedwards/anaconda3/envs/py36/lib/python3.6/site-packages/sklearn/model_selection/_validation.py in cross_val_predict(estimator, X, y, groups, cv, n_jobs, verbose, fit_params, pre_dispatch, method)
    407 
    408     if not _check_is_permutation(test_indices, _num_samples(X)):
--> 409         raise ValueError('cross_val_predict only works for partitions')
    410 
    411     inv_test_indices = np.empty(len(test_indices), dtype=int)

ValueError: cross_val_predict only works for partitions
James Edwards asked Apr 07 '17 10:04



1 Answer

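cross_val_predict cannot work with TimeSeriesSplit because the first block of samples in a TimeSeriesSplit is only ever used for training and never appears in a test set, so no predictions are made for it. cross_val_predict requires the test sets to cover every sample exactly once (a partition), which is what the error message refers to.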

For example, when your dataset is [1, 2, 3, 4, 5] and you use four splits:

  • fold 1 - train: [1], test: [2]
  • fold 2 - train: [1, 2], test: [3]
  • fold 3 - train: [1, 2, 3], test: [4]
  • fold 4 - train: [1, 2, 3, 4], test: [5]

In none of the folds is 1 in the test set.
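The folds listed above can be reproduced with a short script. This is a minimal sketch, assuming a toy array named data and n_splits=4 (neither comes from the question); it collects the test indices to show that index 0 is never among them.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy dataset from the example above (an assumption, not the asker's data).
data = np.array([1, 2, 3, 4, 5])
tscv = TimeSeriesSplit(n_splits=4)

tested = []  # every index that appears in some test fold
for fold, (train_idx, test_idx) in enumerate(tscv.split(data), start=1):
    print(f"fold {fold} - train: {data[train_idx].tolist()}, "
          f"test: {data[test_idx].tolist()}")
    tested.extend(test_idx.tolist())

# Index 0 (the value 1) never shows up, so the test indices do not cover
# all samples, which is the condition cross_val_predict checks.
print("indices that were tested:", sorted(tested))
```

Because the collected test indices are not a permutation of all sample indices, the check seen in the traceback (_check_is_permutation) fails.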

If you want predictions for 2-5, you can loop through the splits generated by your CV manually and store the predictions for 2-5 yourself.
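One way such a manual loop might look is sketched below. The synthetic features and target arrays are stand-ins for the DataFrame columns in the question, and using NaN to mark rows without a prediction is just one possible convention.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import TimeSeriesSplit
from sklearn.svm import SVC

# Synthetic stand-in for the question's `features` and `target` (assumption).
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 6))
target = np.arange(60) % 2  # alternating labels, so every training fold has both classes

clf = SVC(random_state=0)
tscv = TimeSeriesSplit(n_splits=5)

# NaN marks rows that never land in a test fold (the initial training block).
pred = np.full(len(target), np.nan)
for train_idx, test_idx in tscv.split(features):
    model = clone(clf)  # fresh, unfitted copy for each fold
    model.fit(features[train_idx], target[train_idx])
    pred[test_idx] = model.predict(features[test_idx])

has_pred = ~np.isnan(pred)
print(has_pred.sum(), "of", len(pred), "rows have a prediction")
```

With pandas objects, as in the question, the rows would be selected positionally, e.g. features.iloc[train_idx] and target.iloc[train_idx]. Scores should then be computed only on the rows where has_pred is True.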

Matthijs Brouns answered Nov 14 '22 22:11
