
SKlearn: KDTree how to return nearest neighbour based on threshold (Python)

I have a database of 300 images, and for each of them I extracted a bag-of-visual-words (BOVW) vector. Starting from a query image (with query_BOVW extracted using the same dictionary), I need to find similar images in my training dataset.

I built a scikit-learn KDTree on my training set with kd_tree = KDTree(training), and I compute the distances from the query vector with kd_tree.query(query_vector). The second parameter of query is the number of nearest neighbours to return, but what I want is to set a threshold on the Euclidean distance and get however many neighbours fall within that threshold.

I looked through the documentation but did not find anything about this. Am I looking for something that does not make sense?

Thanks for the help.

Furin asked Feb 07 '26 12:02



1 Answer

From the documentation, you can use the method query_radius:

Query for neighbors within a given radius:

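import numpy as np
from sklearn.neighbors import KDTree
np.random.seed(0)
X = np.random.random((10, 3))  # 10 points in 3 dimensions
tree = KDTree(X, leaf_size=2)
# X[:1] keeps the query 2D (one row); recent versions reject a 1D X[0]
print(tree.query_radius(X[:1], r=0.3, count_only=True))  # number of neighbors within distance 0.3
ind = tree.query_radius(X[:1], r=0.3)  # indices of neighbors within distance 0.3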

This works with scikit-learn version 0.19.1.
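For the image-retrieval case in the question, a minimal sketch might look like the following. The random arrays only stand in for the real BOVW vectors, and the vocabulary size (50) and the threshold value are assumptions chosen for illustration. Passing return_distance=True together with sort_results=True returns the neighbours within the threshold ordered from closest to farthest.

```python
import numpy as np
from sklearn.neighbors import KDTree

# Stand-in data (assumption): 300 training BOVW vectors over a 50-word dictionary.
rng = np.random.default_rng(0)
training = rng.random((300, 50))
query_vector = rng.random((1, 50))  # one query, kept 2D: shape (1, n_features)

kd_tree = KDTree(training)  # default metric is Euclidean (minkowski with p=2)

threshold = 2.8  # hypothetical Euclidean distance threshold; tune for real data

# With return_distance=True the result is (indices, distances),
# each an array holding one array per query row.
ind, dist = kd_tree.query_radius(
    query_vector, r=threshold, return_distance=True, sort_results=True
)

neighbour_ids = ind[0]     # indices into `training`, closest first
neighbour_dists = dist[0]  # matching distances, all <= threshold

print(len(neighbour_ids), "training images within the threshold")
```

The number of returned neighbours varies from query to query, and it can be zero if the threshold is too tight, so it is worth checking len(neighbour_ids) before using the result.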

Eolmar answered Feb 09 '26 01:02
