I'm resampling my data (multiclass) by using SMOTE.
sm = SMOTE(random_state=1)
X_res, Y_res = sm.fit_resample(X_train, Y_train)
However, I'm getting this attribute error. Can anyone help?
Short answer
You need to upgrade scikit-learn to version 0.23.1.
Long answer
The newest version of imbalanced-learn, 0.7.0, seems to have an undocumented dependency on scikit-learn v0.23.1. It will give you AttributeError: 'SMOTE' object has no attribute '_validate_data' if your scikit-learn is 0.22 or below.
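To confirm which side of that boundary your environment is on, a small check like the one below can help. It is only a sketch: the helper names are made up for illustration, and the 0.23 threshold is the one reported in this thread rather than anything read from imbalanced-learn's own metadata.

```python
def version_tuple(text):
    """Convert a version string such as '0.23.1' into a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    so '0.24.dev0' becomes (0, 24).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="0.23"):
    """Return True if `installed` is at least `minimum` (threshold assumed from this thread)."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_sklearn_version():
    """Return scikit-learn's version string, or None if it is not installed."""
    try:
        import sklearn
    except ImportError:
        return None
    return sklearn.__version__


if __name__ == "__main__":
    found = installed_sklearn_version()
    if found is None:
        print("scikit-learn is not installed in this environment")
    elif meets_minimum(found):
        print("scikit-learn", found, "should work with imbalanced-learn 0.7.0")
    else:
        print("scikit-learn", found, "is too old; upgrade to 0.23 or later")
```

Run it with the same interpreter (or notebook kernel) that raises the AttributeError, since a different environment may report a different version.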
If you are using Anaconda, installing scikit-learn version 0.23.1 might be tricky. conda update scikit-learn might not update scikit-learn to version 0.23 or higher, because the newest scikit-learn version Conda has at this point in time is 0.22.1. If you try to install it using conda install scikit-learn=0.23.1 or pip install scikit-learn==0.23.1, you will get tons of compatibility checks and the installation might not be quick. Therefore the easiest way to install scikit-learn version 0.23.1 in Anaconda is to create a new virtual environment with a minimal set of packages, so that there are few or no conflicts. Then, in the new virtual environment, install scikit-learn version 0.23.1 followed by version 0.7.0 of imbalanced-learn.
conda create -n test python=3.7.6
conda activate test
pip install scikit-learn==0.23.1
pip install imbalanced-learn==0.7.0
Finally, you need to reinstall your IDE in the new virtual environment in order to use these packages.
However, once scikit-learn version 0.23.1 becomes available in Conda and there are no compatibility issues, you can install it in the base environment directly.
Step 1 - Open your Jupyter notebook
Step 2 - Run pip install --upgrade scikit-learn in a cell
Step 3 - Restart the kernel
Follow the steps exactly as written and scikit-learn will be upgraded.
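One thing that can go wrong with the steps above is that a bare pip call may belong to a different Python than the one your kernel runs on. A minimal sketch of a workaround, assuming you are happy to call pip from Python, is to invoke it through sys.executable. The function names here are hypothetical and only for illustration.

```python
import subprocess
import sys


def build_upgrade_command(package="scikit-learn"):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def upgrade(package="scikit-learn"):
    """Run the upgrade; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(build_upgrade_command(package))


# In a notebook cell you would call upgrade() and then restart the kernel.
```

The kernel still has to be restarted afterwards, because the old scikit-learn modules stay loaded in memory until then.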
Welcome to SO! For your next question like this, you'll probably want to include the versions of python, sklearn, and imblearn you are using.
I ran into this same problem myself and the developers have noticed it: https://github.com/scikit-learn-contrib/imbalanced-learn/issues/727
Might want to follow this page to see if a solution is posted in the next few days. It seems to be about the sklearn library not being cleaned up properly after installing imblearn.
UPDATE
This can be fixed by updating your sklearn to version 0.23 or higher. This should be possible for you through either:
pip install --upgrade scikit-learn
OR
conda update scikit-learn