This question concerns the implementation of KNN searching of KDTrees. Traversal of a KDTree to find a single best match (nearest neighbor) is straightforward, akin to a modified binary search.
How is the traversal modified to exhaustively and efficiently find k-best matches (KNN)?
Edit for clarification: After finding the nearest node M to the input query I, how does the traversal algorithm continue to find the remaining K-1 closest matches to the query? Is there a traversal pattern which guarantees that nodes are visited in order of best to worst match to the query?
Note that octrees are not the same as k-d trees: k-d trees split along a dimension, whereas octrees split around a point. Also, k-d trees are always binary, which is not the case for octrees. With a depth-first search, the nodes are traversed and only the required surfaces are viewed.
A k-d tree, or k-dimensional tree, is a data structure used for organizing some number of points in a space with k dimensions. It is a binary search tree with other constraints imposed on it. K-d trees are very useful for range and nearest neighbour searches.
You can maintain a max-heap of size k (where k is the number of nearest neighbors you want to find).
Start from the root node and insert its distance to the query into the max-heap. Keep searching the k-d tree using the dimensional splitting criterion, and keep updating the max-heap: once it holds k entries, a node replaces the heap's root only if it is closer to the query than the root is.
https://gopalcdas.wordpress.com/2017/05/24/construction-of-k-d-tree-and-using-it-for-nearest-neighbour-search/
~Ashish
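A minimal sketch of the max-heap approach described above, not a definitive implementation. It assumes points are equal-length tuples of numbers, squared Euclidean distance, and a tree stored as nested (point, left, right) tuples. The names build_kdtree, sq_dist and knn are made up for illustration. Python's heapq is a min-heap, so distances are stored negated to make it act as a max-heap.

```python
import heapq

def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples (point, left, right); None is an empty subtree."""
    if not points:
        return None
    axis = depth % len(points[0])              # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                     # the median point becomes this node
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn(node, query, k, depth=0, heap=None):
    """Return a list of (-squared_distance, point) for the k nearest points."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    point, left, right = node
    d = sq_dist(point, query)
    if len(heap) < k:
        heapq.heappush(heap, (-d, point))      # heap not full yet: always keep
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, point))   # closer than the current k-th best
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    knn(near, query, k, depth + 1, heap)       # search the query's side first
    # The far side can only help if the splitting plane is closer than the
    # current k-th best distance, or if fewer than k points were found so far.
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn(far, query, k, depth + 1, heap)
    return heap

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
nearest = sorted((-neg_d, p) for neg_d, p in knn(tree, (9, 2), 3))
print(nearest)  # [(2, (8, 1)), (4, (7, 2)), (16, (9, 6))]
```

The heap's root is always the worst of the k candidates found so far, which is what makes the pruning test on the far subtree cheap. Note that nodes are not visited in best-to-worst order here; the result is only sorted at the end.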
Adding to @Ashish's answer, you can use a max-heap in the following manner:
1) Build a max-heap from the first k elements (arr[0] to arr[k-1]) of the given array. This step is O(k).
2) For each element after the kth (arr[k] to arr[n-1]), compare it with the root of the max-heap.
   a) If the element is smaller than the root, make it the root and call heapify on the max-heap.
   b) Otherwise, ignore it.
   Step 2 is O((n-k)*log(k)).
3) Finally, the max-heap holds the k smallest elements, and its root is the kth smallest element.
Time Complexity: O(k + (n-k)*log(k)) without sorted output. If sorted output is needed then O(k + (n-k)*log(k) + k*log(k)).
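The steps above can be sketched in Python roughly as follows. The function name k_smallest is made up for illustration, and since heapq only provides a min-heap, values are negated to simulate the max-heap (an implementation choice, not part of the steps themselves).

```python
import heapq

def k_smallest(arr, k):
    """Return the k smallest elements of arr in ascending order."""
    if k <= 0:
        return []
    heap = [-x for x in arr[:k]]         # step 1: max-heap of the first k elements
    heapq.heapify(heap)                  # O(k)
    for x in arr[k:]:                    # step 2: scan the remaining n-k elements
        if x < -heap[0]:                 # smaller than the current maximum (the root)
            heapq.heapreplace(heap, -x)  # replace the root and re-heapify, O(log k)
    return sorted(-x for x in heap)      # step 3, plus the optional O(k*log(k)) sort

print(k_smallest([7, 10, 4, 3, 20, 15], 3))  # [3, 4, 7]
```

For a KNN search, the array elements would be the distances from the query to each candidate point.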