
Dealing with imbalanced data in CatBoost

Is there a parameter like "scale_pos_weight" in the catboost package, as there is in the xgboost package in Python?

Harshit Mehta asked Aug 11 '17 21:08



2 Answers

Yes, the parameter is named "class_weights"; you can find it here: Documentation

You have to pass a list, for example [0.8, 0.2] for binary classification or [0.3, 0.8, 0.4, 0.6] for a four-class problem. The values don't have to sum to 1; they are used as multipliers.
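As a rough illustration of how such a list might be built and passed, here is a minimal sketch. The helper names `inverse_frequency_weights` and `make_classifier` are made up for this example, and the inverse-frequency heuristic is just one common choice, not something the answer prescribes. It also assumes the list is matched to the class labels in sorted order, which is worth checking against the documentation for your version.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, ordered by sorted class label.

    Hypothetical helper: uses total / (n_classes * count), so rarer
    classes receive larger multipliers.
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return [total / (n_classes * counts[c]) for c in sorted(counts)]


def make_classifier(class_weights):
    """Build a CatBoostClassifier that uses the given class weights."""
    # Imported lazily so the weight helper works without catboost installed.
    from catboost import CatBoostClassifier

    return CatBoostClassifier(class_weights=class_weights, verbose=False)


# Toy labels: 80% class 0, 20% class 1.
y = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(y)
```

Calling `make_classifier(weights)` and then `.fit(X, y)` on your own feature matrix would train with the minority class weighted up.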

Gaarv answered Oct 06 '22 01:10


CatBoost also has a scale_pos_weight parameter, starting from version 0.6.1.
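For a binary problem, usage might look like the sketch below. The helper names `negative_to_positive_ratio` and `make_classifier` are hypothetical, and setting the weight to the negative/positive count ratio is a convention borrowed from common xgboost practice, not a CatBoost requirement.

```python
def negative_to_positive_ratio(labels, positive=1):
    """Return n_negative / n_positive for a binary label sequence."""
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples in labels")
    return n_neg / n_pos


def make_classifier(scale_pos_weight):
    """Build a CatBoostClassifier that up-weights the positive class."""
    # Imported lazily so the ratio helper works without catboost installed.
    from catboost import CatBoostClassifier

    return CatBoostClassifier(scale_pos_weight=scale_pos_weight, verbose=False)


# Toy labels: 8 negatives, 2 positives.
y = [0] * 8 + [1] * 2
spw = negative_to_positive_ratio(y)
```

With these toy labels each positive example counts four times as much as a negative one. It is probably simplest to pick either this parameter or class_weights rather than combining the two.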

Anna Veronika Dorogush answered Oct 06 '22 02:10
