
sklearn - cross validation with precision scoring for a subset of classes

I have a dataset for classification with 3 class labels [0,1,2].

I want to run cross validation and try several estimators, but I am interested in scoring with precision of only classes 1 and 2. I don't care about the precision of class 0, and I don't want its scoring to throw off the CV optimization. I also don't care about the recall of any of the classes. In other words, I want to make sure that whenever 1 or 2 are predicted, it's with very high confidence.

So the question is, how do I run cross_val_score and tell its scoring function to disregard precision of class 0?

UPDATE: Here's example code based on the accepted answer:

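import numpy as np
from sklearn import metrics
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it
from sklearn.model_selection import cross_val_score

def custom_precision_score(y_true, y_pred):
    # Per-class precision and support for classes 1 and 2 only. Passing labels
    # explicitly keeps the result aligned even if a fold is missing a class.
    precision_tuple, recall_tuple, fscore_tuple, support_tuple = metrics.precision_recall_fscore_support(
        y_true, y_pred, labels=[1, 2])
    # Average the two precisions, weighted by each class's number of true samples
    weighted_precision = np.average(precision_tuple, weights=support_tuple)
    return weighted_precision

custom_scorer = metrics.make_scorer(custom_precision_score)

# clf is the estimator; featuresArray and targetArray are the features and labels
scores = cross_val_score(clf, featuresArray, targetArray, cv=10, scoring=custom_scorer)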
Oren Solomianik asked Dec 25 '13 12:12



1 Answer

cross_val_score accepts a scorer callable through its scoring parameter, and you can build one with your own scoring strategy using make_scorer. Inside your self-defined score function score_func(y, y_pred, **kwargs), which make_scorer wraps, you can restrict the computation to the classes you care about.
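
As a rough sketch of the same idea without a hand-written score function: make_scorer forwards extra keyword arguments to the function it wraps, so precision_score can be limited to classes 1 and 2 through its labels parameter. The synthetic data, the LogisticRegression estimator, and the name precision_1_2 are assumptions made for illustration, not part of the question, and zero_division requires scikit-learn 0.22 or later.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import cross_val_score

# Synthetic 3-class data standing in for the asker's featuresArray / targetArray.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_classes=3, random_state=0,
)

# Extra keyword arguments are passed through to precision_score on every fold,
# so precision is averaged over classes 1 and 2 only; class 0 is ignored.
precision_1_2 = make_scorer(
    precision_score, labels=[1, 2], average="macro", zero_division=0
)

# Hypothetical estimator choice; any classifier can be swapped in here.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5, scoring=precision_1_2)
print(scores.mean())
```

Here average="macro" gives classes 1 and 2 equal weight. average="weighted" would weight them by support, as the code in the question does, and average="micro" would measure how often a prediction of 1 or 2 is correct overall.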

lennon310 answered Oct 22 '22 14:10
