
Got continuous is not supported error in RandomForestRegressor

I'm just trying to run a simple RandomForestRegressor example, but when I test the accuracy I get this error:

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in accuracy_score(y_true, y_pred, normalize, sample_weight)
    177
    178     # Compute accuracy for each possible representation
--> 179     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    180     if y_type.startswith('multilabel'):
    181         differing_labels = count_nonzero(y_true - y_pred, axis=1)

/Users/noppanit/anaconda/lib/python2.7/site-packages/sklearn/metrics/classification.pyc in _check_targets(y_true, y_pred)
     90     if (y_type not in ["binary", "multiclass", "multilabel-indicator",
     91                        "multilabel-sequences"]):
---> 92         raise ValueError("{0} is not supported".format(y_type))
     93
     94     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

This is a sample of the data layout. I can't show the real data.

target, func_1, func_2, func_2, ... func_200
float, float, float, float, ... float

Here's my code.

import pandas as pd
import numpy as np
from sklearn.preprocessing import Imputer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor, ExtraTreesRegressor, GradientBoostingRegressor
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import tree

train = pd.read_csv('data.txt', sep='\t')

labels = train.target
train.drop('target', axis=1, inplace=True)
cat = ['cat']
train_cat = pd.get_dummies(train[cat])

train.drop(train[cat], axis=1, inplace=True)
train = np.hstack((train, train_cat))

imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(train)
train = imp.transform(train)

x_train, x_test, y_train, y_test = train_test_split(train, labels.values, test_size = 0.2)

clf = RandomForestRegressor(n_estimators=10)

clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
accuracy_score(y_test, y_pred) # This is where I get the error.

toy asked Sep 19 '15 05:09


2 Answers

It's because accuracy_score is for classification tasks only. For regression you should use something different, for example:

clf.score(X_test, y_test) 

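Here X_test holds the test samples and y_test the corresponding ground-truth values. The score method computes the predictions internally and, for a regressor, returns R-squared (the coefficient of determination).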
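A minimal sketch of evaluating the regressor with regression metrics instead of accuracy_score. The synthetic data from make_regression is only a stand-in for the real file, and the import path sklearn.model_selection assumes a newer scikit-learn than the one in the question (which used sklearn.cross_validation). The variable names reg, mse and mae are illustrative, not from the original post.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the real data (200 float features, float target).
X, y = make_regression(n_samples=300, n_features=200, noise=10.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

reg = RandomForestRegressor(n_estimators=10, random_state=0)
reg.fit(x_train, y_train)
y_pred = reg.predict(x_test)

# score() predicts internally and returns R-squared for regressors.
r2_from_score = reg.score(x_test, y_test)
# The same value computed from explicit predictions.
r2_from_metric = r2_score(y_test, y_pred)

# Error-based metrics, expressed in (squared) units of the target.
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)

print("R^2 (score):   ", r2_from_score)
print("R^2 (r2_score):", r2_from_metric)
print("MSE:", mse)
print("MAE:", mae)
```

Which metric to report depends on the problem: R-squared is unitless, while MSE and MAE are tied to the scale of the target.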

Ibraim Ganiev answered Sep 22 '22 22:09


Since you are doing a regression task, you should use a regression metric such as R-squared (coefficient of determination) instead of accuracy score (accuracy score is used for classification problems).
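To see where the word "continuous" in the error comes from, here is a small sketch using type_of_target from sklearn.utils.multiclass, which reports how scikit-learn categorises a target array. The values below are made up for illustration. A regressor predicts real-valued targets, which are categorised as continuous, and accuracy_score rejects that type.

```python
from sklearn.metrics import accuracy_score
from sklearn.utils.multiclass import type_of_target

# Made-up values for illustration only.
y_class = [0, 1, 1, 0]               # discrete class labels
y_float = [0.37, 1.52, 2.08, 0.91]   # real-valued regression targets

class_kind = type_of_target(y_class)
float_kind = type_of_target(y_float)
print(class_kind, float_kind)

# accuracy_score only accepts classification-style targets,
# so real-valued targets raise a ValueError.
try:
    accuracy_score(y_float, y_float)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```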

R-squared can be computed by calling the score method provided by RandomForestRegressor, for example:

rfr.score(X_test,Y_test) 
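For reference, R-squared is defined as 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. The plain-Python sketch below works through that formula on made-up numbers; the helper name r_squared is hypothetical and not part of scikit-learn.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

r2 = r_squared(y_true, y_pred)           # SS_res = 1.5, SS_tot = 29.1875
r2_perfect = r_squared(y_true, y_true)   # perfect predictions
mean_value = sum(y_true) / len(y_true)
r2_mean = r_squared(y_true, [mean_value] * len(y_true))  # always predict the mean

print(round(r2, 4), r2_perfect, r2_mean)
```

A model that always predicts the mean of the true values scores 0, a perfect model scores 1, and a model worse than the mean baseline scores below 0.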
ThReSholD answered Sep 24 '22 22:09