Can anyone tell me what the problem with my code is? Why can I predict probabilities for the iris dataset with LogisticRegression, while KNeighborsClassifier gives me only 0 or 1 instead of a result like the one LogisticRegression yields?
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn import metrics

iris = load_iris()
X = iris.data
y = iris.target

# stratified 10-fold split; the last fold's train/test indices are used below
skf = StratifiedKFold(n_splits=10)
for train_index, test_index in skf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

ln = LogisticRegression()
ln.fit(X_train, y_train)
ln.predict_proba(X_test)[:,1]
array([ 0.18075722, 0.08906078, 0.14693156, 0.10467766, 0.14823032, 0.70361962, 0.65733216, 0.77864636, 0.67203114, 0.68655163, 0.25219798, 0.3863194 , 0.30735105, 0.13963637, 0.28017798])
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=5, algorithm='ball_tree', metric='euclidean')
knn.fit(X_train, y_train)
knn.predict_proba(X_test)[0:10,1]
array([ 0., 0., 0., 0., 0., 1., 1., 1., 1., 1.])
Because KNN has a very limited concept of probability: its estimate is simply the fraction of votes among the nearest neighbours. Currently each of your query points has 5 neighbours with the same label, hence a probability of exactly 0 or 1. Increase the number of neighbours to 15 or 100, or query a point near the decision boundary, and you will see more diverse results.
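For instance, a minimal sketch (assuming the same X_train/X_test split as in the question) that only changes n_neighbors:

from sklearn.neighbors import KNeighborsClassifier

# with 15 neighbours the vote fractions can take any value k/15,
# so the estimates are no longer forced to exactly 0 or 1
knn15 = KNeighborsClassifier(n_neighbors=15, algorithm='ball_tree', metric='euclidean')
knn15.fit(X_train, y_train)
print(knn15.predict_proba(X_test)[:, 1])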
KNeighborsClassifier uses 5 as the default value for n_neighbors (otherwise known as k), but this can easily be tuned with something like K-fold cross-validation, trying out different values of k and keeping the best one.
When k=1 you estimate the probability from a single sample: your closest neighbour. This is very sensitive to all sorts of distortions such as noise, outliers, mislabelled data, and so on. By using a higher value of k, you tend to be more robust against those distortions.
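As an illustration of the cross-validation suggestion above, a rough sketch (the candidate values of k here are arbitrary):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# try several values of k and keep the one with the best cross-validated accuracy
for k in (1, 5, 15, 25):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=10)
    print(k, scores.mean())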
First, import the KNeighborsClassifier module and create a KNN classifier object, passing the number of neighbors as an argument to KNeighborsClassifier(). Then fit the model on the training set with fit() and make predictions on the test set with predict().
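For example, a minimal end-to-end sketch of that workflow (the 70/30 split and k=5 are just illustrative choices):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)

# create the classifier with the number of neighbors passed as an argument
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)           # fit on the training set
y_pred = knn.predict(X_test)        # predict on the test set
print(metrics.accuracy_score(y_test, y_pred))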