Is there a package or a simple way to search for the k nearest neighbors (especially with a kd tree) of a single point in R? All the packages that provide this function (for example RANN or FNN) compute the knn for all the points in a matrix, but I need to do it for only one point.
For example, I have a matrix with 5 points "A" to "E" and I want to find for "A" the 2 nearest neighbors among the 4 other points ("B" to "E"), without doing the same calculation for all the rows in the dataset (without computing knn for "B", "C", "D", "E").
I hope my question is clear; my English is not good.
Thank you for your help.
The value of k is the number of training samples (neighbors) consulted to classify a test sample. KNN is a non-parametric method, and k itself is a hyperparameter that you choose rather than learn. One rule of thumb for choosing it is k = sqrt(N)/2, where N is the number of samples in your training dataset.
KNN algorithm pseudocode: calculate D(x, x_i) for i = 1, 2, ..., n, where D is the Euclidean distance between the query point x and each data point x_i. Sort the calculated distances in ascending order. Choose k and take the first k distances from the sorted list; the points they belong to are the k nearest neighbors.
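The steps above can be sketched in base R for a single query point. This is a minimal brute-force illustration, not a kd-tree search; the function name `knn_one_point` and the sample coordinates are made up for the example.

```r
# Brute-force k-nearest-neighbour search for ONE query point (base R only).
# `knn_one_point` is a hypothetical helper name, not part of any package.
knn_one_point <- function(data, query, k) {
  stopifnot(is.matrix(data), length(query) == ncol(data),
            k >= 1, k <= nrow(data))
  # Step 1: Euclidean distance D(x, x_i) from the query to every row of `data`.
  diffs <- sweep(data, 2, query)        # subtract the query from each row
  dists <- sqrt(rowSums(diffs^2))
  # Step 2: order the distances ascending; Step 3: keep the first k.
  nearest <- order(dists)[seq_len(k)]
  list(index = nearest, dist = dists[nearest])
}

# Made-up points "A" to "E"; search the neighbours of "A" among "B" to "E".
pts <- rbind(A = c(0, 0), B = c(1, 0), C = c(0, 2), D = c(3, 3), E = c(5, 1))
res <- knn_one_point(pts[-1, , drop = FALSE], pts["A", ], k = 2)
rownames(pts)[-1][res$index]            # "B" "C"
```

Only one row of distances is computed, so the cost is one pass over the candidate points rather than a full all-pairs search.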
An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
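The plurality vote can be illustrated with a few lines of base R. This is only a sketch: `plurality_vote` is a hypothetical helper name, and resolving ties by taking the label that sorts first is an assumption of this example, not something the text prescribes.

```r
# Plurality vote over the class labels of the k nearest neighbours.
# `plurality_vote` is a hypothetical helper, not a function from any package.
plurality_vote <- function(labels) {
  counts <- table(labels)               # one count per distinct label, sorted by label
  # which.max() returns the first maximum, so a tie goes to the label
  # that sorts first (an assumption made for this sketch).
  names(counts)[which.max(counts)]
}

plurality_vote(c("red", "blue", "red"))  # "red"
plurality_vote("blue")                   # k = 1: the single neighbour's class, "blue"
```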
In KNN, k is the number of nearest neighbors, and it is the core deciding factor. k is generally chosen to be odd when there are 2 classes, so that the vote cannot end in a tie. When k = 1, the algorithm is known as the nearest neighbor algorithm.
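As a rough illustration, the sqrt(N)/2 rule of thumb and the odd-k convention for two classes can be combined in a small helper. The name `suggest_k` and the choice to round and then bump an even value up by one are assumptions of this sketch; treat the result as a starting point to validate, not a definitive choice.

```r
# Hypothetical helper: starting value for k from the sqrt(N)/2 rule of thumb.
suggest_k <- function(n_train, two_class = TRUE) {
  k <- max(1, round(sqrt(n_train) / 2))    # never go below one neighbour
  # With two classes, move an even k up to the next odd value to avoid tied votes.
  if (two_class && k %% 2 == 0) k <- k + 1
  k
}

suggest_k(100)  # sqrt(100)/2 = 5, already odd: 5
suggest_k(64)   # sqrt(64)/2 = 4, bumped to odd: 5
```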
If I understand correctly, you can do this with the FNN package:
> library(FNN)
> X <- matrix(runif(25), 5, 5)
> X
          [,1]      [,2]      [,3]      [,4]      [,5]
[1,] 0.7475301 0.6725876 0.2511358 0.5048512 0.1196027
[2,] 0.5777907 0.6337206 0.8334608 0.5067914 0.6410024
[3,] 0.5488786 0.9613076 0.2217271 0.6906149 0.7396482
[4,] 0.8230380 0.8596784 0.6348114 0.6211107 0.3089131
[5,] 0.6531433 0.8682462 0.2555402 0.2443061 0.5292509
> knnx.dist(X[-1,], X[1, , drop=FALSE], k=2)
          [,1]     [,2]
[1,] 0.4870996 0.531889
> knnx.index(X[-1,], X[1, , drop=FALSE], k=2)
     [,1] [,2]
[1,]    3    4
Note that the result of knnx.index refers to the rows of the matrix passed to the function (here X[-1,]), so 3 and 4 actually mean rows 4 and 5 of the original data set.
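The index shift can be undone mechanically. A minimal base R sketch, assuming the query row was removed with X[-query_row, ] as in the session above; `to_original_rows` is a hypothetical helper name, and `idx` stands for one row of the knnx.index result.

```r
# Hypothetical helper: map indices returned for X[-query_row, ] back to rows of X.
to_original_rows <- function(idx, query_row, n_rows) {
  candidates <- seq_len(n_rows)[-query_row]  # rows that were searched, original numbering
  candidates[idx]
}

# Indices 3 and 4 from the session above, with row 1 held out of a 5-row matrix.
to_original_rows(c(3, 4), query_row = 1, n_rows = 5)  # 4 5
```

The same lookup works when the query is any other row, for example holding out row 3 instead of row 1.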