 

plot_decision_regions with error "Filler values must be provided when X has more than 2 training features."

I am trying to draw a 2D decision-region plot for an SVC trained on a binary (Bernoulli) target.

The text was converted to vectors with average word2vec, the data was standardised, and then split into train and test sets. Through grid search I found the best C and gamma (RBF kernel).

from sklearn.svm import SVC
import matplotlib.pyplot as plt
from mlxtend.plotting import plot_decision_regions

clf = SVC(C=100, gamma=0.0001)
clf.fit(X_train1, y_train)

plot_decision_regions(X_train, y_train, clf=clf, legend=2)

plt.xlabel(X.columns[0], size=14)
plt.ylabel(X.columns[1], size=14)
plt.title('SVM Decision Region Boundary', size=16)

I receive the error: ValueError: y must be a NumPy array. Found

I then tried converting y to a NumPy array, which produced: ValueError: y must be an integer array. Found object. Try passing the array as y.astype(np.integer)

Finally, I converted it to an integer array. Now the error is: ValueError: Filler values must be provided when X has more than 2 training features.
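The first two errors come from the type of y rather than from the classifier. A minimal sketch of the conversion, using made-up labels (y_raw and y_names are hypothetical stand-ins for the real target column); for a pandas Series, calling .to_numpy() first gives the underlying array:

```python
import numpy as np

# Hypothetical labels stored as an object-dtype array, which is what a
# pandas column with mixed or non-numeric storage typically yields.
y_raw = np.array([0, 1, 1, 0, 1], dtype=object)

# Cast to a plain integer NumPy array.
y_int = y_raw.astype(int)

# String class names cannot be cast directly; np.unique can encode them
# as integer codes (classes holds the original names, sorted).
y_names = np.array(["neg", "pos", "pos", "neg"], dtype=object)
classes, y_encoded = np.unique(y_names, return_inverse=True)
```

With y in this form, the remaining error is only about the number of feature columns in X.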

Ramakrishna B asked Oct 23 '18 15:10


2 Answers

You can use PCA to reduce your multi-dimensional data to two dimensions. Then pass the result to plot_decision_regions, and there will be no need for filler values. Note that the classifier has to be fitted on the same two-dimensional data that is plotted.

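import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from mlxtend.plotting import plot_decision_regions

# Project the training features onto their first two principal components
pca = PCA(n_components=2)
X_train2 = pca.fit_transform(X_train)

# Fit the classifier on the 2D data so it matches what is plotted
clf = SVC(C=100, gamma=0.0001)
clf.fit(X_train2, y_train)

# y_train must be an integer NumPy array here
plot_decision_regions(X_train2, y_train, clf=clf, legend=2)

# The axes are principal components, not the original columns
plt.xlabel('Principal component 1', size=14)
plt.ylabel('Principal component 2', size=14)
plt.title('SVM Decision Region Boundary', size=16)
plt.show()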
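To get a feel for what the two-component reduction keeps, here is a rough NumPy-only sketch of the projection on synthetic data. It uses a plain SVD rather than scikit-learn's PCA class, so component signs may differ from what PCA returns, and the random matrix X is only a stand-in for the real features:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # hypothetical data with 5 features

# PCA operates on mean-centered features.
X_centered = X - X.mean(axis=0)

# Rows of Vt are the principal directions, ordered by decreasing variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project onto the first two directions: a 2-column array suitable for plotting.
X_2d = X_centered @ Vt[:2].T

# Fraction of the total variance retained by the two components.
explained = (S ** 2) / np.sum(S ** 2)
kept = explained[:2].sum()
```

If kept is small, the 2D plot discards most of the structure in the data. Either way, the regions drawn belong to a classifier trained on the projection, not to the model trained on all features.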
Vardan Agarwal answered Oct 23 '22 14:10


I spent some time on this too, because plot_decision_regions then complained: ValueError: Column(s) [2] need to be accounted for in either feature_index or filler_feature_values. One more parameter is needed to avoid this.

So, say you have 4 features and they are unnamed:

X_train_std.shape[1] == 4

We can refer to each feature by its index: 0, 1, 2, 3. You can only plot 2 features at a time, say 0 and 2.

You'll need to specify one additional parameter (beyond those specified in @sos.cott's answer), feature_index, and provide filler values for the remaining features:

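from mlxtend.plotting import plot_decision_regions

value = 1.5   # value at which the unplotted features are held fixed
width = 0.75  # only samples within value +/- width are drawn

ax = plot_decision_regions(X_train.values, y_train.values, clf=clf,
                           feature_index=[0, 2],                        # the two features to plot
                           filler_feature_values={1: value, 3: value},  # fixed values for the other features
                           filler_feature_ranges={1: width, 3: width})  # sample-selection range around each filler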
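The role of the two filler arguments can be illustrated with a small NumPy sketch. This is a conceptual illustration only, not mlxtend's actual implementation; build_grid_rows and samples_near_fillers are hypothetical helper names:

```python
import numpy as np

def build_grid_rows(xx, yy, n_features, feature_index, filler_values):
    """Turn a 2D mesh into full-width rows that a classifier can score."""
    rows = np.empty((xx.size, n_features))
    rows[:, feature_index[0]] = xx.ravel()   # plotted feature on the x axis
    rows[:, feature_index[1]] = yy.ravel()   # plotted feature on the y axis
    for col, val in filler_values.items():   # unplotted features held constant
        rows[:, col] = val
    return rows

def samples_near_fillers(X, filler_values, filler_ranges):
    """Mask of samples whose unplotted features lie close to the filler values."""
    mask = np.ones(X.shape[0], dtype=bool)
    for col, val in filler_values.items():
        mask &= np.abs(X[:, col] - val) < filler_ranges[col]
    return mask

# A 3x3 mesh over features 0 and 2, with features 1 and 3 fixed at 1.5.
xx, yy = np.meshgrid(np.linspace(-2, 2, 3), np.linspace(-2, 2, 3))
grid = build_grid_rows(xx, yy, 4, [0, 2], {1: 1.5, 3: 1.5})

# Three made-up samples; the second is too far from the filler in feature 1.
X = np.array([[0.0, 1.4, 0.0, 1.6],
              [0.0, 3.0, 0.0, 1.5],
              [0.0, 1.0, 0.0, 2.0]])
mask = samples_near_fillers(X, {1: 1.5, 3: 1.5}, {1: 0.75, 3: 0.75})
```

The plot is therefore a 2D slice through the higher-dimensional decision function, so the filler values are best chosen near where the data actually lie (for standardised features, values close to 0 are a natural starting point).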
alisa answered Oct 23 '22 13:10