Can someone please tell how to use ensembles in sklearn using partial fit. I don't want to retrain my model. Alternatively, can we pass pre-trained models for ensembling ? I have seen that voting classifier for example does not support training using partial fit.
The Mlxtend library has an implementation of VotingEnsemble which allows you to pass in pre-fitted models. For example if you have three pre-trained models clf1, clf2, clf3. The following code would work.
from mlxtend.classifier import EnsembleVoteClassifier
import copy
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[1,1,1], fit_base_estimators=False)
When set to false the fit_base_estimators argument in EnsembleVoteClassifier ensures that the classifiers are not refit.
In general, when looking for more advanced technical features that sci-kit learn does not provide, look to mlxtend as a first point of reference.
Workaround:
VotingClassifier checks that estimators_ is set in order to understand whether it is fitted, and is using the estimators in estimators_ list for prediction. If you have pre trained classifiers, you can put them in estimators_ directly like the code below.
However, it is also using LabelEnconder, so it assumes labels are like 0,1,2,... and you also need to set le_ and classes_ (see below).
from sklearn.ensemble import VotingClassifier
from sklearn.preprocessing import LabelEncoder
clf_list = [clf1, clf2, clf3]
eclf = VotingClassifier(estimators = [('1' ,clf1), ('2', clf2), ('3', clf3)], voting='soft')
eclf.estimators_ = clf_list
eclf.le_ = LabelEncoder().fit(y)
eclf.classes_ = seclf.le_.classes_
# Now it will work without calling fit
eclf.predict(X,y)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With