I have n (about 10^5) points on a hypersphere of dimension m (between 10^4 to 10^6).
I am going to make a bunch of queries of the form "given a point p, find the closest of the n points to p". I'll make about n of these queries.
(Not sure if the hypersphere fact helps at all.)
The simple naive algorithm to solve this is, for each query, to compare p to all other n points. Doing this n times ends up with a runtime of O(n^2 m), which is far too big for me to be able to compute.
Is there a more efficient algorithm I can use? If I could get it to O(nm) with some log factors that'd be great.
Probably not. Having many dimensions makes efficient indexing extremely hard. That is why people look for opportunities to reduce the number of dimensions to something manageable.
See https://en.wikipedia.org/wiki/Curse_of_dimensionality and https://en.wikipedia.org/wiki/Dimensionality_reduction for more.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With