I am using PyCharm to run a machine-learning interface. The SVM algorithm keeps crashing the interface with the following error:
line 1220, in pushButton_8_handler
    ax1 = sns.distplot(Y_predict)
line 979, in inv
    raise LinAlgError("singular matrix")
numpy.linalg.LinAlgError: singular matrix
The code is below.
from sklearn import model_selection
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.preprocessing import LabelEncoder
from sklearn import tree
from sklearn.metrics import accuracy_score
import pickle
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def plot_distribution(inp):
    plt.figure()
    ax = sns.distplot(inp)
    plt.axvline(np.mean(inp), color="k", linestyle="dashed", linewidth=5)
    _, max_ = plt.ylim()
    plt.text(
        inp.mean() + inp.mean() / 10,
        max_ - max_ / 10,
        "Mean: {:.2f}".format(inp.mean()),
    )
    return plt.figure
print(np.mean(Y_test))
print(np.mean(Y_predict))
# ________________________Dist Predict Vs. Test and Means value_________________________
plt.figure(figsize=(9, 5))
o = np.mean(Y_test)
tt = np.mean(Y_train)
ax1 = sns.distplot(Y_predict)
ax2 = sns.distplot(Y_test)
RR1 = ('Mean:', tt)
RR2 = ('Mean:', o)
plt.axvline(np.mean(Y_predict), color='b', linestyle='dashed', linewidth=5, label=RR1)
plt.axvline(np.mean(Y_test), color='orange', linestyle='dashed', linewidth=5, label=RR2)
plt.legend()
plt.savefig('DecisionTreeClassifier2.png')
Basically, the error occurs at ax1 = sns.distplot(Y_predict).
I hope I was able to explain the problem.
Thanks
A discussion of a similar error can be found here: https://github.com/mwaskom/seaborn/issues/1502; in your case, however, it seems that you are using an old version of seaborn.
In the latest versions of seaborn this error no longer appears (ref: https://github.com/mwaskom/seaborn/pull/1823).
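If upgrading is an option, the newer seaborn API avoids the problem entirely, since histplot does not run a KDE unless you ask for one. A minimal sketch, assuming seaborn >= 0.11 and using a made-up discrete Y_predict for illustration:

# Sketch: plotting discrete predictions with the newer seaborn API
# (assumes seaborn >= 0.11, where histplot supersedes distplot)
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

Y_predict = np.array([0, 1, 1, 0, 1, 0, 0, 1])  # hypothetical discrete predictions

plt.figure(figsize=(9, 5))
sns.histplot(Y_predict, discrete=True)  # histogram only; no KDE bandwidth estimation
plt.axvline(np.mean(Y_predict), color='b', linestyle='dashed', linewidth=5,
            label='Mean: {:.2f}'.format(np.mean(Y_predict)))
plt.legend()
plt.savefig('prediction_distribution.png')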
Explanation of the error
You get this error because distplot uses kde=True as its default input argument, and it is the KDE step that fails. Y_predict appears to be discrete, with many identical values falling into large bins. The average k-nearest-neighbor distance is then 0 (for any reasonably small k), which breaks the kernel bandwidth estimation of the KDE and produces the singular-matrix error.
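If you cannot upgrade seaborn, a common workaround is to disable the KDE so distplot only draws the histogram. A minimal sketch, again with a made-up discrete Y_predict standing in for your predictions:

# Sketch: workaround for older seaborn - keep distplot but turn off the failing KDE
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

Y_predict = np.array([0, 0, 1, 1, 1, 0, 1, 0])  # hypothetical discrete predictions

plt.figure(figsize=(9, 5))
ax1 = sns.distplot(Y_predict, kde=False)  # histogram only; skips the KDE bandwidth step
plt.axvline(np.mean(Y_predict), color='b', linestyle='dashed', linewidth=5,
            label='Mean: {:.2f}'.format(np.mean(Y_predict)))
plt.legend()
plt.savefig('DecisionTreeClassifier2.png')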