I have a one-dimensional list of values and would like to know the best data structure or algorithm for finding the value nearest to a given query. Most (all?) of the solutions I found in questions here are for two or more dimensions. Can anybody suggest an approach for my case?
My instinct tells me to sort the data and use binary search somehow. By the way, there is no limit on the construction or insertion time for any tree needed, so probably someone can suggest a better tree than simply a sorted list.
In high-dimensional spaces, the k-NN algorithm faces difficulties: computing distances and finding the nearest neighbors both become computationally more expensive.
The k-nearest neighbors (KNN) algorithm is a simple, supervised machine learning algorithm that can be used to solve both classification and regression problems.
It can be used for both classification and regression problems. It's ideal for non-linear data since there's no assumption about underlying data. It can naturally handle multi-class cases.
VC dimension of kNN with k=1 is infinite.
If you need something faster than O(log(n)), which you can easily get with a sorted array or a binary search tree, you can use a van Emde Boas tree. For integer keys drawn from a universe of size U, vEB trees give you O(log(log(U))) to search for the closest element on either side.
If insertion time is irrelevant, then binary search on a sorted array is the simplest way to achieve O(log N) query time. Each time an item is added, re-sort the array. For each query, perform a binary search. If an exact match is found, return it. Otherwise, the binary search should return the index at which the query would have been inserted. Use this index to check the two neighboring items and determine which of them is closer to the query point.
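The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, assuming the values are numbers held in an already sorted list. The function name `nearest` and the tie-breaking rule (prefer the smaller neighbor) are my own choices, not part of the answer. `bisect_left` from the standard library returns the insertion index.

```python
from bisect import bisect_left


def nearest(sorted_values, query):
    """Return the element of sorted_values closest to query.

    Assumes sorted_values is sorted in ascending order.
    Ties are resolved in favor of the smaller neighbor (an arbitrary choice).
    """
    if not sorted_values:
        raise ValueError("sorted_values must not be empty")

    # Index where query would be inserted to keep the list sorted.
    i = bisect_left(sorted_values, query)

    # Query is at or below the smallest element: only one candidate.
    if i == 0:
        return sorted_values[0]
    # Query is above the largest element: only one candidate.
    if i == len(sorted_values):
        return sorted_values[-1]

    # Otherwise compare the two neighbors around the insertion point.
    before, after = sorted_values[i - 1], sorted_values[i]
    return before if query - before <= after - query else after


values = sorted([10, 3, 7, 22, 15])
print(nearest(values, 8))
```

If items arrive over time, `bisect.insort` keeps the list sorted without re-sorting everything, although each insertion still costs O(N) because elements have to be shifted.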
I suppose that there are solutions with O(1) time. I will try to think of one that doesn't involve too much memory usage...