As far as I know, a multi-label problem can be solved with a one-vs-all scheme, for which scikit-learn implements OneVsRestClassifier as a wrapper around a classifier such as svm.SVC. Say we have a multi-label problem with n labels: I am wondering how it would differ if I literally trained n individual binary classifiers, one per label, and evaluated them separately.
I know this is like a "manual" way of implementing one-vs-all rather than using the wrapper, but are the two ways actually different? If so, how do they differ, for example in execution time or in the performance of the classifier(s)?
The difference between multi-class and multi-label classification is that in multi-class problems the classes are mutually exclusive (each sample belongs to exactly one class), whereas in multi-label problems each label represents a separate binary classification task, so a sample can carry several labels at once or none; the tasks are nevertheless related.
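To make the distinction concrete, here is a small sketch of what the two kinds of target look like in scikit-learn. The arrays are made-up toy data; type_of_target is the scikit-learn utility that reports how a target array is interpreted.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Multi-class: one class id per sample, classes are mutually exclusive.
y_multiclass = np.array([0, 2, 1, 2])

# Multi-label: one 0/1 column per label; a row may contain several 1s or none.
Y_multilabel = np.array([
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
])

kind_mc = type_of_target(y_multiclass)
kind_ml = type_of_target(Y_multilabel)
print(kind_mc, kind_ml)
```

Each column of the multi-label matrix is a binary target in its own right, which is why the problem can be handled one label at a time.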
Although the one-vs-rest approach cannot handle multiple datasets, it trains fewer classifiers (N instead of N(N-1)/2 for N classes), which makes it the faster option and often the preferred one. On the other hand, the one-vs-one approach is less prone to creating class imbalance, because each of its classifiers compares only two classes instead of pitting one class against all the others combined.
One-vs-rest (OvR for short, also referred to as One-vs-All or OvA) is a heuristic method for using binary classification algorithms for multi-class classification. It involves splitting the multi-class dataset into multiple binary classification problems.
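As a rough illustration of that split, the sketch below builds the per-class binary targets by hand with NumPy. The label vector is made-up toy data, and binary_targets is just an illustrative name.

```python
import numpy as np

# Toy multi-class labels with three classes: 0, 1 and 2.
y = np.array([0, 2, 1, 2, 0])

# One binary problem per class: 1 where the sample belongs to that class,
# 0 for "the rest".
classes = np.unique(y)
binary_targets = {int(c): (y == c).astype(int) for c in classes}

for c, target in binary_targets.items():
    print(c, target)
```

A binary classifier trained on each of these targets gives one score per class, and the class with the highest score is predicted.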
In One-vs-One classification, for a dataset with N classes, we have to generate N(N-1)/2 binary classifier models. Using this approach, we split the primary dataset into one dataset for each pair of classes, so that every class is set against every other class.
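The classifier counts can be checked directly in scikit-learn. This is only a sketch on synthetic data: the dataset from make_classification and the choice of LogisticRegression as the base estimator are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Synthetic multi-class problem with N = 5 classes (illustrative only).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_classes=5, random_state=0
)

base = LogisticRegression(max_iter=1000)
ovr = OneVsRestClassifier(base).fit(X, y)
ovo = OneVsOneClassifier(base).fit(X, y)

# Both wrappers keep their fitted binary models in estimators_.
n_ovr = len(ovr.estimators_)  # N
n_ovo = len(ovo.estimators_)  # N * (N - 1) / 2
print(n_ovr, n_ovo)
```

With 5 classes this is 5 models for one-vs-rest against 10 for one-vs-one, and the gap widens quadratically as N grows.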
There would be no difference. For multi-label classification, sklearn's OneVsRestClassifier implements binary relevance, which is exactly what you have described: one independent binary classifier per label.
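One way to check this yourself is to fit both variants on the same data and compare their predictions. The sketch below assumes synthetic data from make_multilabel_classification and a linear-kernel SVC as the base estimator; it is an illustration, not a benchmark.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_multilabel_classification
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic multi-label data: Y is a 0/1 indicator matrix with 4 label columns.
X, Y = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=4, random_state=0
)

base = SVC(kernel="linear")

# Wrapper: one binary SVC per label, fitted internally.
ovr = OneVsRestClassifier(base).fit(X, Y)
pred_wrapper = ovr.predict(X)

# "Manual" binary relevance: fit a fresh copy of the same SVC on each column.
manual = [clone(base).fit(X, Y[:, k]) for k in range(Y.shape[1])]
pred_manual = np.column_stack([clf.predict(X) for clf in manual])

same = np.array_equal(pred_wrapper, pred_manual)
print(same)
```

Since each per-label model is trained on the same inputs either way, the wrapper is mainly a convenience: it gives a single fit/predict interface, and it accepts an n_jobs parameter to fit the per-label models in parallel, which is where any execution-time difference would come from.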