I'm trying to fit an SGDRegressor to my data and then check the accuracy. The fitting works fine, but then the predictions are not in the same datatype(?) as the original target data, and I get the error
ValueError: Can't handle mix of multiclass and continuous
when calling print "Accuracy:", ms.accuracy_score(y_test,predictions).
The data looks like this (just with 200,000+ rows):
   Product_id  Date        product_group1  Price  Net price  Purchase price  Hour  Quantity  product_group2
0  107         12/31/2012  10              300    236        220             10    1         108
The code is as follows:
    from sklearn.preprocessing import StandardScaler
    import numpy as np
    from sklearn.linear_model import SGDRegressor
    from sklearn import metrics as ms

    msk = np.random.rand(len(beers)) < 0.8
    train = beers[msk]
    test = beers[~msk]

    X = train[['Price', 'Net price', 'Purchase price', 'Hour', 'Product_id', 'product_group2']]
    y = train[['Quantity']]
    y = y.as_matrix().ravel()

    X_test = test[['Price', 'Net price', 'Purchase price', 'Hour', 'Product_id', 'product_group2']]
    y_test = test[['Quantity']]
    y_test = y_test.as_matrix().ravel()

    clf = SGDRegressor(n_iter=2000)
    clf.fit(X, y)
    predictions = clf.predict(X_test)

    print "Accuracy:", ms.accuracy_score(y_test,predictions)
What should I do differently? Thank you!
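Accuracy is a classification metric, so you can't use it with a regression. SGDRegressor predicts continuous values, while accuracy_score expects discrete class labels, which is why it complains about a mix of multiclass and continuous targets. Use a regression metric instead, such as mean_squared_error, mean_absolute_error, or r2_score from sklearn.metrics. See the documentation for info on the various metrics.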
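As a minimal sketch of what scoring a regressor could look like: the arrays below are made-up values standing in for y_test and predictions (they are not the asker's data), and the code uses Python 3 print syntax. It shows accuracy_score rejecting continuous predictions, then scores the same values with regression metrics from sklearn.metrics.

```python
import numpy as np
from sklearn import metrics as ms

# Made-up values for illustration only (assumption: not the asker's data).
y_test = np.array([1, 2, 3, 4])               # integer target quantities
predictions = np.array([1.5, 2.0, 2.5, 4.0])  # continuous regressor output

# accuracy_score compares class labels, so it rejects continuous predictions.
try:
    ms.accuracy_score(y_test, predictions)
except ValueError as err:
    print("accuracy_score failed:", err)

# Regression metrics measure how far the predictions are from the targets.
mae = ms.mean_absolute_error(y_test, predictions)  # mean absolute error
mse = ms.mean_squared_error(y_test, predictions)   # mean squared error
rmse = np.sqrt(mse)                                # same units as the target
r2 = ms.r2_score(y_test, predictions)              # 1.0 is a perfect fit

print("MAE:", mae)
print("MSE:", mse)
print("RMSE:", rmse)
print("R^2:", r2)
```

Which metric to report depends on the problem: MAE and RMSE are in the units of Quantity, while R^2 is unitless and easier to compare across datasets.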