
How to set a custom scoring function with GridSearchCV from sklearn for regression?

I have been using GridSearchCV(...scoring="accuracy"...) for classification models. Now I want to use GridSearchCV for a regression model and set scoring to my own error function.

Example code:

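import numpy as np
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV

def rmse(predict, actual):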
    predict = np.array(predict)
    actual = np.array(actual)

    distance = predict - actual

    square_distance = distance ** 2

    mean_square_distance = square_distance.mean()

    score = np.sqrt(mean_square_distance)

    return score

rmse_score = make_scorer(rmse)

gsSVR = GridSearchCV(...scoring=rmse_score...)
gsSVR.fit(X_train,Y_train)
SVR_best = gsSVR.best_estimator_
print(gsSVR.best_score_)

However, I found that this returns the parameter set for which the error score is the highest. As a result, I get the worst parameter set and score. How can I get the best estimator and score in this case?

Summary:

classification -> GridSearchCV(scoring="accuracy") -> best_estimator_ is the best

regression -> GridSearchCV(scoring=rmse_score) -> best_estimator_ is the worst

will Park asked Jan 27 '23 02:01


1 Answer

Technically, RMSE is a loss, where lower is better. You can tell make_scorer this with its greater_is_better option:

greater_is_better : boolean, default=True Whether score_func is a score function (default), meaning high is good, or a loss function, meaning low is good. In the latter case, the scorer object will sign-flip the outcome of the score_func.
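To see the sign-flip in isolation, here is a minimal sketch. The toy data and the DummyRegressor baseline are my own assumptions for illustration, not part of the question; the only point is that the same rmse function yields a positive value as a score and the negated value as a loss.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer


def rmse(actual, predict):
    # Root mean squared error; argument order is (true values, predictions).
    actual = np.asarray(actual, dtype=float)
    predict = np.asarray(predict, dtype=float)
    return float(np.sqrt(np.mean((predict - actual) ** 2)))


# Toy data (assumed for illustration): the mean of y is 1.0,
# so the baseline model is off by exactly 1.0 on every sample.
X = np.zeros((4, 1))
y = np.array([0.0, 0.0, 2.0, 2.0])
model = DummyRegressor(strategy="mean").fit(X, y)

as_score = make_scorer(rmse)                           # treated as "higher is better"
as_loss = make_scorer(rmse, greater_is_better=False)   # result is sign-flipped

score_value = as_score(model, X, y)
loss_value = as_loss(model, X, y)
print(score_value, loss_value)
```

Because GridSearchCV always maximizes the scorer's output, the sign-flipped version makes the candidate with the smallest RMSE come out on top.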

You also need to change the order of the inputs from rmse(predict, actual) to rmse(actual, predict), because that's the order in which the scorer passes them (true values first, predictions second). So the final scorer will look like this:

def rmse(actual, predict):

    ...
    ...
    return score

rmse_score = make_scorer(rmse, greater_is_better=False)
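For completeness, here is a rough end-to-end sketch of how this might fit together. The synthetic data from make_regression, the linear-kernel SVR, and the param_grid values are all assumptions made up for the example, since the question elides its actual GridSearchCV arguments. Note that best_score_ comes back negative, so it is negated to recover the RMSE.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR


def rmse(actual, predict):
    # Root mean squared error; argument order is (true values, predictions).
    actual = np.asarray(actual, dtype=float)
    predict = np.asarray(predict, dtype=float)
    return float(np.sqrt(np.mean((predict - actual) ** 2)))


# Synthetic training data (assumed; stands in for the asker's X_train, Y_train).
X_train, Y_train = make_regression(
    n_samples=120, n_features=4, noise=5.0, random_state=0
)

# Lower RMSE is better, so mark it as a loss.
rmse_score = make_scorer(rmse, greater_is_better=False)

# Hypothetical grid, chosen only to give the search something to compare.
param_grid = {"C": [0.1, 1.0, 10.0, 100.0]}

gsSVR = GridSearchCV(SVR(kernel="linear"), param_grid, scoring=rmse_score, cv=3)
gsSVR.fit(X_train, Y_train)

SVR_best = gsSVR.best_estimator_
best_rmse = -gsSVR.best_score_  # undo the sign-flip to get the actual RMSE
mean_rmse_per_candidate = -gsSVR.cv_results_["mean_test_score"]
print(gsSVR.best_params_, best_rmse)
```

If you don't need a hand-written function, recent scikit-learn versions also accept scoring="neg_root_mean_squared_error", which follows the same negated convention.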
Vivek Kumar answered Feb 02 '23 08:02