Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient implementation of the Nearest Neighbour Search

I am trying to implement an efficient algorithm for nearest-neighbour search problem.

I have read tutorials about some data structures, which support operations for this kind of problems (for example, R-tree, cover tree, etc.), but all of them are difficult to implement.

Also I cannot find sample source code for these data structures. I know C++ and I am trying to solve this problem in this language.

Ideally, I need links that describe how to implement these data structures using source code.

like image 525
dato datuashvili Avatar asked Apr 06 '12 08:04

dato datuashvili


People also ask

How do I find my nearest neighbor efficiently?

There are no known exact methods for finding nearest neighbors efficiently. As both the number of points increases and the number of dimensions increase, we fall victim to the curse of dimensionality. In high dimensions, all points are almost equally distant from each other.

What is nearest neighbor search explain with example?

All nearest neighbors As a simple example: when we find the distance from point X to point Y, that also tells us the distance from point Y to point X, so the same calculation can be reused in two different queries.

What is the basic idea of implementing k nearest neighbors method?

KNN works by finding the distances between a query and all the examples in the data, selecting the specified number examples (K) closest to the query, then votes for the most frequent label (in the case of classification) or averages the labels (in the case of regression).

What is the advantage of nearest neighbor method?

What are the advantages of KNN ? Can learn non-linear decision boundaries when used for classfication and regression. Can came up with a highly flexible decision boundary adjusting the value of K.


1 Answers

There are several good choices of fast nearest neighbor search libraries.

  • ANN, which is based on the work of Mount and Arya. This work is documented in a paper by S. Arya and D. M. Mount. "Approximate nearest neighbor queries in fixed dimensions". In Proc. 4th ACM-SIAM Sympos. Discrete Algorithms, pages 271–280, 1993.

  • FLANN, which is based on the work of Marius Muja & Co. There is a paper by Marius Muja and David G. Lowe, "Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration", in International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009. The code for FLANN is available on github

FLANN seems to be quicker in some cases, and is a more modern version of the code with solid bindings for a number of other languages, that can incorporate changes rapidly. ANN is probably a good choice if you want a solid well-tested standard library.

Edit in Response to Comment

Both of these libraries have extensive documentation and examples.

Sample code for ANN is available in the Manual, In section 2.1.4

Sample code for FLANN is available in the FLANN repository examples directory, for example /examples/flann_examples.c

like image 169
Andrew Walker Avatar answered Nov 15 '22 02:11

Andrew Walker