I have a set of vectors. For a given vector in that set, I would like to find the subset that is closest to this vector. What algorithm can do this?
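It is the minimization of the sum of squares of the components of $\vec{b} - A\vec{x}$, so that $\|\vec{b} - A\vec{x}^{*}\| \le \|\vec{b} - A\vec{x}\|$ for all $\vec{x} \in \mathbb{R}^{4}$. The resulting $A\vec{x}^{*}$ is precisely the vector in the column space of $A$ that is "closest" to $\vec{b}$.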
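One standard way to compute such a minimizer is through the normal equations. This assumes the columns of $A$ are linearly independent, which the answer above does not state:

$$A^{T}A\,\vec{x}^{*} = A^{T}\vec{b} \quad\Longrightarrow\quad \vec{x}^{*} = (A^{T}A)^{-1}A^{T}\vec{b}$$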
This can be done using a quite simple algorithm called the Lagrange-Gauss algorithm. It iteratively shortens the two vectors of our basis until one of the basis vectors becomes a shortest element of the lattice.
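A minimal sketch of Lagrange-Gauss reduction for a two-dimensional integer lattice might look like the following. The function name `lagrange_gauss` and the tuple representation are illustrative choices, and the sketch assumes the two input vectors are linearly independent with integer components.

```python
def lagrange_gauss(u, v):
    """Reduce a 2D integer lattice basis (u, v); returns (shortest, second)."""
    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1]

    # Keep u as the shorter of the two basis vectors.
    if dot(u, u) > dot(v, v):
        u, v = v, u

    while True:
        # Nearest integer to (u.v)/(u.u), computed with integer arithmetic only.
        m = (2 * dot(u, v) + dot(u, u)) // (2 * dot(u, u))
        # Subtract that multiple of u from v to shorten v.
        v = (v[0] - m * u[0], v[1] - m * u[1])
        # Stop once v can no longer become shorter than u.
        if dot(v, v) >= dot(u, u):
            return u, v
        u, v = v, u
```

For example, `lagrange_gauss((2, 1), (3, 2))` starts from a skewed basis of the integer grid and ends with two vectors of length 1.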
This class of algorithms is called Nearest Neighbor or K Nearest Neighbor.
The cosine similarity, as excepeiont says, will work if the direction of the vectors is what matters. If the vectors represent positions in a space, then any distance metric on that space will work.
For example, the Euclidean distance: take the square root of the sum of the squared differences in each dimension. This gives you a distance for each vector; then sort your set of vectors in ascending order of this distance.
Computing the distances is O(N) in time; a full sort of the results adds O(N log N). If this is too slow for you, you might want to look at some common K Nearest Neighbour algorithms.
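The brute-force approach described here could be sketched roughly as follows. The names `euclidean` and `nearest_neighbors` and the parameter `k` are made up for illustration, and vectors are assumed to be equal-length sequences of numbers.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared differences in each dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, vectors, k):
    """Return the k vectors closest to `query`, nearest first."""
    # Sort by distance to the query, then keep the first k.
    return sorted(vectors, key=lambda v: euclidean(query, v))[:k]

vectors = [(0, 0), (3, 4), (1, 1), (10, 10)]
closest = nearest_neighbors((1, 2), vectors, k=2)
```

If only a few neighbours are needed, `heapq.nsmallest(k, vectors, key=...)` from the standard library avoids sorting the whole set.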
Use the cosine similarity (http://en.wikipedia.org/wiki/Cosine_similarity) between the vectors and then sort them.
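As a rough sketch of that idea, assuming none of the vectors is the zero vector (the function names here are illustrative only):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_cosine(query, vectors):
    """Sort vectors from most to least similar in direction to `query`."""
    return sorted(vectors, key=lambda v: cosine_similarity(query, v), reverse=True)

ranked = rank_by_cosine((1, 0), [(0, 1), (2, 0), (1, 1), (-1, 0)])
```

Note that the ordering is descending, since a larger cosine similarity means a smaller angle; magnitude is ignored, so `(2, 0)` counts as a perfect match for `(1, 0)`.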
If your problem involves a large amount of data:
I published a related algorithm on ddj.com that finds the nearest line to a given point:
Accelerated Search For the Nearest Line
You would have to modify this algorithm, e.g. by converting the given vector to a number of points. This will reduce the number of possible matches drastically. The exact match then has to be checked for each possible match by