I am using a Random Forest classifier in scikit-learn with an imbalanced data set of two classes. I am much more worried about false negatives than false positives. Is it possible to fix the false negative rate (to, say, 1%) and ask scikit-learn to optimize the false positive rate somehow?
If this classifier doesn't support it, is there another classifier that does?
To reduce the number of false negatives (FN) or false positives (FP), one option is to retrain the model on the same data while penalizing the error type you care about more heavily (for example through sample or class weights), refitting and re-evaluating until the error you are targeting stops improving.
Another common method used to decrease false negatives or false positives is changing the decision threshold. The default threshold in binary classification models is 0.5: when the predicted probability of the positive class is greater than 0.5, the prediction is considered positive.
A false positive is an outcome where the model incorrectly predicts the positive class, and a false negative is an outcome where the model incorrectly predicts the negative class. Together with true positives and true negatives, these four outcomes are what classification metrics are derived from.
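To make the threshold idea concrete, here is a minimal sketch of how you could pick a probability cut-off on a validation set so that the false negative rate stays at or below a target (the 1% target, the synthetic make_classification data, and all variable names are illustrative assumptions, not part of any official recipe):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data purely for illustration (~5% positives).
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Probability of the positive class on a held-out validation set.
proba = clf.predict_proba(X_val)[:, 1]

target_fnr = 0.01  # at most 1% false negatives, as in the question
best = None
for threshold in np.linspace(0.0, 1.0, 101):
    pred = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_val, pred, labels=[0, 1]).ravel()
    fnr = fn / (fn + tp)
    fpr = fp / (fp + tn)
    # Among thresholds that meet the FNR target, keep the one with the lowest FPR.
    if fnr <= target_fnr and (best is None or fpr < best[1]):
        best = (threshold, fpr)

print("chosen threshold and validation FPR:", best)
```

At prediction time you would then compare `predict_proba(...)[:, 1]` against the chosen threshold instead of calling `predict`, which always uses 0.5.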
I believe the problem of class imbalance in sklearn can be partially resolved by using the class_weight parameter.
This parameter is either a dictionary, in which each class is assigned a weight, or a string that tells sklearn how to build this dictionary. For instance, setting this parameter to 'balanced' (called 'auto' in older versions) weights each class in inverse proportion to its frequency.
By giving the under-represented class a higher weight, you can end up with 'better' results.
Classifiers like SVM or logistic regression also offer this class_weight parameter.
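As a sketch of how this looks in practice (the synthetic data and the weights in the explicit dictionary below are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic imbalanced data purely for illustration.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)

# Let scikit-learn weight each class inversely to its frequency.
rf = RandomForestClassifier(class_weight='balanced', random_state=0).fit(X, y)

# The same parameter is accepted by other estimators, e.g. logistic regression,
# and an explicit dictionary works too (here: errors on class 1 cost ten times more).
lr = LogisticRegression(class_weight={0: 1, 1: 10}, max_iter=1000).fit(X, y)
```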
This Stack Overflow answer gives some other ideas on how to handle class imbalance, like undersampling and oversampling.
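For the resampling idea, here is a minimal sketch of random oversampling with plain scikit-learn utilities (the synthetic data and variable names are assumptions for illustration; dedicated libraries such as imbalanced-learn offer more sophisticated samplers):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.utils import resample

# Synthetic imbalanced data purely for illustration.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)

X_min, y_min = X[y == 1], y[y == 1]   # minority class
X_maj, y_maj = X[y == 0], y[y == 0]   # majority class

# Oversample the minority class with replacement until the classes are balanced.
X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=len(y_maj), random_state=0)

X_balanced = np.vstack([X_maj, X_min_up])
y_balanced = np.concatenate([y_maj, y_min_up])
print(np.bincount(y_balanced))  # roughly equal class counts
```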
I found this article on the class imbalance problem:
http://www.chioka.in/class-imbalance-problem/
It summarizes several possible solutions to the problem.
Hope it helps.