I'm using Surprise to perform cross-validation:
from surprise import SVD, KNNBasic, KNNWithMeans, NormalPredictor, Dataset

def multiple_cv(data, folds=5):
    algorithms = (SVD, KNNBasic, KNNWithMeans, NormalPredictor)
    measures = ['RMSE', 'MAE']
    for a in algorithms:
        data.split(folds)
        algo = a()
        algo.fit(data)
I call the function this way:
data = Dataset.load_builtin('ml-100k')
multiple_cv(data)
and I get this error:
Traceback (most recent call last):
  File "/home/user/PycharmProjects/pac1/prueba.py", line 30, in <module>
    multiple_cv(data)
  File "/home/user/PycharmProjects/pac1/prueba.py", line 19, in multiple_cv
    algo.fit(data)
  File "surprise/prediction_algorithms/matrix_factorization.pyx", line 155, in surprise.prediction_algorithms.matrix_factorization.SVD.fit
  File "surprise/prediction_algorithms/matrix_factorization.pyx", line 204, in surprise.prediction_algorithms.matrix_factorization.SVD.sgd
AttributeError: 'DatasetAutoFolds' object has no attribute 'global_mean'
Did I miss something?
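As per the docs, the input to the fit method must be a Trainset, which is a different object from the Dataset you are passing. Calling data.split(folds) does not turn data into a Trainset: it is still a DatasetAutoFolds, which has no global_mean attribute, hence the error. You can split a Dataset into a Trainset (and a testset) by using the output of a split method, for example that of a cross-validation iterator such as KFold, as described in the docs.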
In your example,
data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()
Then, you can use
algo.fit(trainset)
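More generally, the Trainset and the testset obtained from a split are the inputs to the fit and test methods respectively. Note that build_full_trainset uses every rating for training, so on its own it leaves no held-out testset to evaluate on.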