
K Nearest-Neighbor Algorithm [closed]

Suppose I am using the KNN algorithm with k=5 and want to classify an unknown object by finding its 5 nearest neighbors. What should I do if, after determining the 4 nearest neighbors, the next 2 (or more) nearest objects are at the same distance? Which of these 2 or more objects should be chosen as the 5th nearest neighbor?

Gwaihir asked Feb 03 '11 18:02





1 Answer

Which object of these 2 or more should be chosen as the 5th nearest neighbor?

It really depends on how you want to implement it.

Most algorithms will do one of three things:

  1. Include all equidistant points, so for this estimate they'll use 6 points, not 5.
  2. Use the "first" point found of the two equidistant ones.
  3. Pick a random point (usually with a consistent seed, so results are reproducible) from the 2 points found.
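
The three options above can be sketched roughly as follows. This is only an illustration in plain Python, not how any particular library does it: the function name `k_nearest_with_ties`, the `tie_break` parameter, and the sample data are all made up for this example, and distances are compared with exact equality, which real floating-point data may need to relax with a tolerance.

```python
import random
from math import dist


def k_nearest_with_ties(points, labels, query, k, tie_break="include", seed=0):
    """Return the labels of the k nearest neighbors of `query`.

    `tie_break` (hypothetical parameter) decides what happens when several
    points share the distance of the k-th neighbor:
      "include" - keep every tied point (may return more than k labels)
      "first"   - keep tied points in their original input order
      "random"  - sample among the tied points with a seeded RNG
    """
    # Sort by (distance, index), so ties are ordered by input position.
    ranked = sorted((dist(p, query), i) for i, p in enumerate(points))
    if k >= len(ranked):
        return [labels[i] for _, i in ranked]

    cutoff = ranked[k - 1][0]  # distance of the k-th neighbor
    closer = [i for d, i in ranked if d < cutoff]   # strictly inside the cutoff
    tied = [i for d, i in ranked if d == cutoff]    # exactly on the cutoff
    slots = k - len(closer)                         # places left for tied points

    if tie_break == "include":
        chosen = closer + tied
    elif tie_break == "first":
        chosen = closer + tied[:slots]
    elif tie_break == "random":
        rng = random.Random(seed)  # fixed seed keeps results reproducible
        chosen = closer + rng.sample(tied, slots)
    else:
        raise ValueError(f"unknown tie_break: {tie_break!r}")
    return [labels[i] for i in chosen]


# Made-up data: four points at distance 1, two tied at distance 2, one far away.
points = [(1, 0), (0, 1), (-1, 0), (0, -1), (2, 0), (0, 2), (5, 5)]
labels = ["a", "a", "b", "b", "a", "b", "b"]

for mode in ("include", "first", "random"):
    print(mode, k_nearest_with_ties(points, labels, (0, 0), k=5, tie_break=mode))
```

With k=5 on this data, "include" returns six labels while the other two modes return five, differing only in which of the two tied points fills the last slot.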

That said, most algorithms based on radial searching inherently assume stationarity, in which case it really shouldn't matter which of the options above you choose. In general, any of them should, in theory, provide a reasonable default (especially since the tied points are the furthest points in the approximation, and should have the lowest effective weightings).
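
To illustrate the point about weightings, here is a small sketch of a distance-weighted vote. The inverse-distance weight `1 / d` is an assumption chosen for the example (the answer does not name a specific weighting scheme), and `weighted_vote` and the neighbor lists are hypothetical.

```python
from collections import defaultdict


def weighted_vote(neighbors, eps=1e-12):
    """Pick the label with the largest total inverse-distance weight.

    `neighbors` is a list of (distance, label) pairs. `eps` avoids a
    division by zero when a neighbor coincides with the query point.
    """
    totals = defaultdict(float)
    for d, label in neighbors:
        totals[label] += 1.0 / (d + eps)  # farther neighbors contribute less
    return max(totals, key=totals.get)


# Four clear neighbors at distance 1, plus a tie for the 5th slot at distance 2.
base = [(1.0, "a"), (1.0, "a"), (1.0, "a"), (1.0, "b")]

print(weighted_vote(base + [(2.0, "a")]))              # 5th slot goes to the "a" point
print(weighted_vote(base + [(2.0, "b")]))              # 5th slot goes to the "b" point
print(weighted_vote(base + [(2.0, "a"), (2.0, "b")]))  # both tied points included
```

In this made-up data all three choices produce the same class, because the tied points carry half the weight of the closer ones. When the vote among the closer neighbors is nearly even, the choice of tie-break can still change the result.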

Reed Copsey answered Sep 20 '22 18:09
