I have a set of points for which I want to construct KD Tree. After some time I want to add few more points to this KDTree periodically. Is there any way to do this in scipy implementation
Let's understand with an example by following the below steps: Import the required libraries using the below python code. Generate data points using the random generator as shown in the below code. Pass the points to kdtree and find all point pairings within a r distance in a kd-tree using the below code.
kd-tree for quick nearest-neighbor lookup. This class provides an index into a set of k-dimensional points which can be used to rapidly look up the nearest neighbors of any point.
Note that Octrees are not the same as k-d trees: k-d trees split along a dimension and octrees split around a point. Also k-d trees are always binary, which is not the case for octrees. By using a depth-first search the nodes are to be traversed and only required surfaces are to be viewed.
The problem with k-d-trees is that they are not designed for updates.
While you can somewhat easily insert objects (if you use a pointer based representation, which needs substantially more memory than an array-based tree), and do deletions with tricks such as tombstone messages, doing such changes will degrate the performance of the tree.
I am not aware of a good method for incrementally rebalancing a k-d-tree. For 1-dimensional trees you have red-black-trees, B-trees, B*-trees, B+-trees and such things. These don't obviously work with k-d-trees because of the rotating axes and thus different sorting. So in the end, with a k-d-tree, it may be best to just collect changes, and from time to time do a full tree rebuild. Then at least this part of the tree will be quite good.
However, there exists a similar structure (that in my experiments often outperforms the k-d-tree!): the R*-tree. Instead of performing binary splits, it uses rectangular bounding boxes to collect objects, and a lot of thought was put into making the tree a dynamic data structure. This is also where the R*-tree performs much better than the R-tree: it has a much more clever split for kNN search, and it performs incremental rebalancing to improve its structure.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With