
Does GridSearchCV use predict or predict_proba, when using auc_score as score function?

Tags: scikit-learn

The predict function generates predicted class labels, which give only a single operating point and therefore always result in a triangular ROC curve. A more curved ROC curve is obtained using the predicted class probabilities. The latter is, as far as I know, more accurate. If so, the area under the 'curved' ROC curve is probably the best measure of classification performance within the grid search.
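To illustrate the difference described above, here is a minimal sketch. It assumes a recent scikit-learn, where auc_score is named roc_auc_score, and it uses a synthetic dataset and a LogisticRegression model chosen purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary problem (an assumption for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

labels = clf.predict(X_test)              # hard 0/1 class labels
probs = clf.predict_proba(X_test)[:, 1]   # probability of the positive class

# With hard labels there is only one interior threshold, so the curve
# is the "triangle" (0,0) -> (fpr, tpr) -> (1,1).
fpr_labels, tpr_labels, _ = roc_curve(y_test, labels)

# With probabilities every distinct score is a candidate threshold.
fpr_probs, tpr_probs, _ = roc_curve(y_test, probs)

auc_labels = roc_auc_score(y_test, labels)
auc_probs = roc_auc_score(y_test, probs)

print("ROC points from labels:", len(fpr_labels), "AUC:", auc_labels)
print("ROC points from probabilities:", len(fpr_probs), "AUC:", auc_probs)
```

The label-based curve has at most three points, while the probability-based curve typically has many more, which is why the two AUC values usually differ.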

Therefore I am curious whether the class labels or the class probabilities are used for the grid search when the area under the ROC curve is the performance measure. I tried to find the answer in the code, but could not figure it out. Does anyone here know the answer?

Thanks

Bastiaan van den Berg asked Feb 19 '13 10:02



2 Answers

To use auc_score for grid searching, you really need to use predict_proba or decision_function, as you pointed out. This is not possible in the 0.13 release. If you pass score_func=auc_score, it will use predict, which doesn't make any sense.

Edit: Since 0.14 it is possible to grid-search on AUC by setting the new scoring parameter to roc_auc: GridSearchCV(est, param_grid, scoring='roc_auc'). It will do the right thing and use predict_proba (or decision_function if predict_proba is not available). See the what's new page of the current dev version.

You need to install the current master from github to get this functionality or wait until April (?) for 0.14.
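As a rough sketch of the scoring='roc_auc' usage mentioned above: the dataset, the estimator, and the parameter grid below are assumptions for illustration, and the import path assumes a recent scikit-learn where GridSearchCV lives in sklearn.model_selection.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic binary problem (an assumption for illustration only).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical grid over the regularization strength C.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring="roc_auc",  # scorer works on continuous scores, not predict()
    cv=5,
)
search.fit(X, y)

best_params = search.best_params_
best_auc = search.best_score_  # mean cross-validated ROC AUC of the best setting
print(best_params, best_auc)
```

Passing the string 'roc_auc' rather than the metric function itself is what lets the search obtain continuous scores from the estimator instead of hard labels.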

Andreas Mueller answered Oct 19 '22 02:10


After running some experiments with scikit-learn's SVC (which has predict_proba available) and comparing the results from predict_proba and decision_function, it seems that roc_auc in GridSearchCV uses decision_function to compute AUC scores. I found a similar discussion in the question "Reproducing Sklearn SVC within GridSearchCV's roc_auc scores manually".
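An experiment along these lines could look like the sketch below. It is not the original experiment: the dataset and split are assumptions, and it relies on get_scorer('roc_auc') from a recent scikit-learn, which returns the same scorer that GridSearchCV resolves from the string 'roc_auc'.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import get_scorer, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary problem (an assumption for illustration only).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# probability=True makes predict_proba available on SVC.
svc = SVC(probability=True, random_state=0).fit(X_train, y_train)

# AUC as computed by the built-in 'roc_auc' scorer.
auc_scorer = get_scorer("roc_auc")(svc, X_test, y_test)

# AUC computed by hand from each kind of continuous output.
auc_decision = roc_auc_score(y_test, svc.decision_function(X_test))
auc_proba = roc_auc_score(y_test, svc.predict_proba(X_test)[:, 1])

print("scorer:           ", auc_scorer)
print("decision_function:", auc_decision)
print("predict_proba:    ", auc_proba)
```

If the scorer value matches the decision_function value, that supports the observation above. For SVC the probabilities come from a separate calibration step, so their ranking, and hence their AUC, can differ slightly from that of decision_function.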

Augusto Peterlevitz answered Oct 19 '22 04:10