
How do we set the ratio in SMOTE to get more positive samples than negative samples?

I am trying to use SMOTE to handle imbalanced classes in a binary classification problem. What I know so far is that if we use, for example,

sm = SMOTE(ratio = 1.0, random_state=10)

Before OverSampling, counts of label '1': [78]
Before OverSampling, counts of label '0': [6266] 

After OverSampling, counts of label '1': 6266
After OverSampling, counts of label '0': 6266

in a case where class 1 is the minority, the result is a 50:50 split between class 0 and class 1,

and

sm = SMOTE(ratio = 0.5, random_state=10)

Before OverSampling, counts of label '1': [78]
Before OverSampling, counts of label '0': [6266] 

After OverSampling, counts of label '1': 3133
After OverSampling, counts of label '0': 6266

makes class 1 half the size of class 0.

My question:

How do we set the ratio to end up with more class 1 samples than class 0 samples, for instance 75:25?

npm asked Sep 01 '25 09:09



1 Answer

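Try passing a dictionary to sampling_strategy (the parameter that replaced ratio in newer imbalanced-learn releases). Each key is a class label, and each value is the number of samples that class should have after resampling.

from imblearn.over_sampling import SMOTE

# Target size for class 1: 18798 is three times 6266, which gives a 75:25 split
smote_on_1 = 18798

smt = SMOTE(sampling_strategy={1: smote_on_1})
# fit_resample replaces fit_sample, which newer imbalanced-learn releases no longer provide
X_train, y_train = smt.fit_resample(X_train, y_train)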
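
If you would rather not hard-code 18798, the target count can be derived from the share you want. The following is only a sketch of the dictionary approach above: the helper names target_count and oversample_to_share are made up for illustration and are not part of imbalanced-learn, and it assumes a release that provides sampling_strategy and fit_resample.

```python
from collections import Counter


def target_count(n_other, share):
    """Number of samples a class needs so that it makes up `share`
    of the total when the remaining classes hold `n_other` samples."""
    if not 0 < share < 1:
        raise ValueError("share must be strictly between 0 and 1")
    return round(n_other * share / (1 - share))


def oversample_to_share(X, y, label=1, share=0.75, random_state=10):
    """Oversample `label` with SMOTE until it reaches the requested share.

    Hypothetical helper; imbalanced-learn is imported lazily so that
    target_count can be used without it installed.
    """
    from imblearn.over_sampling import SMOTE

    counts = Counter(y)
    n_other = sum(c for k, c in counts.items() if k != label)
    target = target_count(n_other, share)
    # SMOTE only adds samples, so the target must not be below the current count.
    if target < counts[label]:
        raise ValueError("target is smaller than the current class size")
    smote = SMOTE(sampling_strategy={label: target}, random_state=random_state)
    return smote.fit_resample(X, y)


# With the counts from the question: 6266 samples of class 0, 75% for class 1.
print(target_count(6266, 0.75))  # 18798
```

Calling oversample_to_share(X_train, y_train) would then play the role of the two SMOTE lines above, with class 1 grown to 18798 samples against 6266 of class 0.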
Prateek sahu answered Sep 03 '25 21:09
