I have a dataset and I want to train my model on that data. After training, I need to know which features are the major contributors to the classification for an SVM classifier.
There is something called feature importance for forest algorithms. Is there anything similar?
Yes, there is the attribute coef_ for the SVM classifier, but it only works for an SVM with a linear kernel. For other kernels it is not available, because the kernel method implicitly transforms the data into another space, and the separating hyperplane found there has no direct per-feature weights in the input space.
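The restriction is easy to check directly. The following is a minimal sketch on synthetic data; the dataset and the names linear_clf and rbf_clf are assumptions made for illustration, not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class problem standing in for the asker's data (assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

linear_clf = SVC(kernel="linear").fit(X, y)
rbf_clf = SVC(kernel="rbf").fit(X, y)

# Linear kernel: one weight per input feature,
# stored with shape (1, n_features) for a two-class problem.
print(linear_clf.coef_.shape)

# Non-linear kernel: accessing coef_ raises AttributeError.
try:
    rbf_clf.coef_
except AttributeError as err:
    print("no coef_ for the rbf kernel:", err)
```

Because coef_ is two-dimensional even for two classes, code that wants one weight per feature usually takes its first row, coef_[0].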
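    from matplotlib import pyplot as plt
    from sklearn import svm

    def f_importances(coef, names):
        # Sort features by coefficient so the bars appear in order
        imp, names = zip(*sorted(zip(coef, names)))
        plt.barh(range(len(names)), imp, align='center')
        plt.yticks(range(len(names)), names)
        plt.show()

    features_names = ['input1', 'input2']
    # X: feature matrix with two columns, Y: class labels
    clf = svm.SVC(kernel='linear')
    clf.fit(X, Y)
    # coef_ has shape (1, n_features) for a two-class problem, so pass its first row
    f_importances(clf.coef_[0], features_names)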
The function draws a horizontal bar chart with one bar per feature, sorted by coefficient value.
In only one line of code:
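Fit an SVM model:
    import pandas as pd
    from sklearn import svm

    # features: DataFrame of inputs; labels: class labels (both assumed to exist already)
    clf = svm.SVC(gamma=0.001, C=100., kernel='linear')  # gamma has no effect with a linear kernel
    clf.fit(features, labels)
and implement the plot as follows:
    pd.Series(abs(clf.coef_[0]), index=features.columns).nlargest(10).plot(kind='barh')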
The result will be a horizontal bar chart of the most contributing features of the SVM model, in absolute values.
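To try the ranking without a plot, here is a small self-contained sketch. The data is synthetic and built so that only one column decides the class; the column names and the variables features, labels and ranking are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVC

# Synthetic data (assumption, for illustration): only 'input3' decides the class.
rng = np.random.default_rng(0)
features = pd.DataFrame(
    rng.normal(size=(200, 4)),
    columns=["input1", "input2", "input3", "input4"],
)
labels = (features["input3"] > 0).astype(int)

clf = SVC(kernel="linear").fit(features, labels)

# Rank features by absolute weight; coef_[0] is the single row for two classes.
ranking = pd.Series(np.abs(clf.coef_[0]), index=features.columns)
ranking = ranking.sort_values(ascending=False)
print(ranking)
```

Coefficient magnitudes are only comparable across features when the features are on similar scales, so on real data it is usually worth standardizing the inputs (for example with sklearn's StandardScaler) before reading the weights as importances.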