How to measure the accuracy of a kNN classifier in Python

I have used kNN to classify my dataset, but I do not know how to measure the accuracy of the trained classifier. Does scikit-learn have a built-in function to check the accuracy of a kNN classifier?

from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
knn.fit(training, train_label)    
predicted = knn.predict(testing)

Appreciate all the help. Thanks

user1946217 asked Apr 04 '13 20:04


3 Answers

You can use the code below to get started. It uses the Iris dataset, which has three classes: Iris-Setosa, Iris-Virginica, and Iris-Versicolor.

It gives me 97.78% accuracy on the test split:

from sklearn import neighbors, datasets, preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix

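# Load the Iris data and hold out 30% of it, stratified by class, for testing
iris = datasets.load_iris()
X, y = iris.data, iris.target
Xtrain, Xtest, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0, train_size=0.7)

# Fit the scaler on the training split only, then apply it to both splits
scaler = preprocessing.StandardScaler().fit(Xtrain)
Xtrain = scaler.transform(Xtrain)
Xtest = scaler.transform(Xtest)

# Train a 3-nearest-neighbours classifier and predict the test labels
knn = neighbors.KNeighborsClassifier(n_neighbors=3)
knn.fit(Xtrain, y_train)
y_pred = knn.predict(Xtest)

# Compare the predictions with the true test labels
print(accuracy_score(y_test, y_pred))         # fraction of correct predictions
print(classification_report(y_test, y_pred))  # per-class precision, recall, F1
print(confusion_matrix(y_test, y_pred))       # rows: true class, columns: predicted class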
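
A single train/test split gives an accuracy that depends on which samples happen to land in the test set. As a hedged sketch, cross-validation averages the accuracy over several splits. The choice of 5 folds, the shuffling seed, and the use of a pipeline to keep the scaler inside each fold are assumptions made for illustration here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# The pipeline refits the scaler on the training part of every fold,
# so no information from the held-out part leaks into the scaling.
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))

# Assumed setup: 5 stratified folds, shuffled with a fixed seed
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("Accuracy per fold:", scores)
print("Mean accuracy: %.4f (std %.4f)" % (scores.mean(), scores.std()))
```

The mean is usually a steadier estimate than any single split, and the standard deviation gives a rough idea of how much the number moves between splits.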
Rheatey Bash answered Nov 12 '22 09:11


Use sklearn.metrics.accuracy_score:

from sklearn.metrics import accuracy_score
# test_label holds the true labels of the testing set
acc = accuracy_score(test_label, predicted)
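
To make this concrete, here is a minimal sketch on toy data. The arrays below are made up for illustration and only reuse the variable names from the question. It also shows two equivalent ways to get the same number: comparing the arrays by hand, and calling the classifier's own score method, which returns mean accuracy for classifiers.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: two well-separated one-dimensional clusters
training = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
train_label = np.array([0, 0, 0, 1, 1, 1])
testing = np.array([[0.05], [0.15], [1.05], [1.15]])
# The last true label deliberately disagrees with its cluster
test_label = np.array([0, 0, 1, 0])

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(training, train_label)
predicted = knn.predict(testing)

acc = accuracy_score(test_label, predicted)   # fraction of matching labels
acc_manual = np.mean(predicted == test_label) # same quantity computed by hand
acc_score = knn.score(testing, test_label)    # predicts and scores in one call

print(acc, acc_manual, acc_score)  # 0.75 0.75 0.75
```

Three of the four test points are classified correctly, so all three values are 0.75.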
Fred Foo answered Nov 12 '22 07:11


Another option is to compute the confusion matrix, which gives you the accuracy of each class as well as the alpha and beta (type I and type II) errors:

from sklearn.metrics import confusion_matrix
# labels must be passed by keyword in current scikit-learn versions
con_mat = confusion_matrix(true_values, pred_values, labels=[0, 1])

This assumes your labels are 0 and 1. If you want more readable output, you can add this code:

import numpy as np
import math
total_accuracy = (con_mat[0, 0] + con_mat[1, 1]) / float(np.sum(con_mat))
class1_accuracy = (con_mat[0, 0] / float(np.sum(con_mat[0, :])))
class2_accuracy = (con_mat[1, 1] / float(np.sum(con_mat[1, :])))
print(con_mat)
print('Total accuracy: %.5f' % total_accuracy)
print('Class1 accuracy: %.5f' % class1_accuracy)
print('Class2 accuracy: %.5f' % class2_accuracy)
print('Geometric mean accuracy: %.5f' % math.sqrt((class1_accuracy * class2_accuracy)))
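
As a small worked example of the arithmetic above, the sketch below uses made-up label arrays (the values are hypothetical, chosen so the counts are easy to check by hand). It also shows that the per-class accuracy computed from the rows of the matrix is what scikit-learn calls recall.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score

# Hypothetical labels: 4 samples of class 0 and 6 samples of class 1
true_values = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
pred_values = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# Rows are true classes, columns are predicted classes
con_mat = confusion_matrix(true_values, pred_values, labels=[0, 1])
print(con_mat)
# [[3 1]
#  [2 4]]

# Correct predictions sit on the diagonal
total_accuracy = np.trace(con_mat) / con_mat.sum()        # (3 + 4) / 10
per_class = con_mat.diagonal() / con_mat.sum(axis=1)      # 3/4 and 4/6

# Per-class accuracy from the matrix rows equals per-class recall
recalls = recall_score(true_values, pred_values, average=None)

print('Total accuracy: %.5f' % total_accuracy)
print('Per-class accuracy:', per_class)
print('Per-class recall:  ', recalls)
```

The off-diagonal entries are the two kinds of error: here 1 sample of class 0 was predicted as class 1, and 2 samples of class 1 were predicted as class 0.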
Noam Peled answered Nov 12 '22 09:11