Algorithm for 2D nearest-neighbour queries with dynamic points

Question

I am trying to find a fast algorithm for finding the (approximate, if need be) nearest neighbours of a given point in a two-dimensional space where points are frequently removed from the dataset and new points are added.

(Relatedly, there are two variants of this problem that interest me: one in which points can be thought of as being added and removed randomly and another in which all the points are in constant motion.)

Some thoughts:

kd-trees offer good performance, but are only suitable for static point sets
R*-trees seem to offer good performance for a variety of dimensions, but the generality of their design (arbitrary dimensions, general content geometries) suggests that a more specialised algorithm might offer performance advantages
Algorithms with existing implementations are preferable (though this is not necessary)

What's a good choice here?

TilmannZ · Accepted Answer

I agree with (almost) everything that @gsamaras said; I would just add a few things:

In my experience (using large datasets with >= 500,000 points), the kNN performance of KD-Trees is worse than that of pretty much any other spatial index by a factor of 10 to 100. I tested them (2 KD-Trees and various other indexes) on a large OpenStreetMap dataset. In the following diagram, the KD-Trees are called KDL and KDS, and the 2D dataset is called OSM-P (left diagram). The diagram is taken from this document; see the bullet points below for more information.
This research describes an indexing method for moving objects, in case you keep (re-)inserting the same points in slightly different positions.
Quadtrees are not too bad either; they can be very fast in 2D, with excellent kNN performance for datasets < 1,000,000 entries.
If you are looking for Java implementations, have a look at my index library. It has implementations of quadtrees, R*-trees, PH-Trees, and others, all with a common API that also supports kNN. The library was written for TinSpin, which is a framework for testing multidimensional indexes. Some results can be found here (it doesn't really describe the test data, but 'OSM-P' results are based on OpenStreetMap data with up to 50,000,000 2D points).
Depending on your scenario, you may also want to consider PH-Trees. They appear to be slower for kNN queries than R-Trees in low dimensionality (though still faster than KD-Trees), but they are faster for removal and updates than R-Trees. If you have a lot of removals/insertions, this may be a better choice (see the TinSpin results, Figures 2 and 46). C++ versions are available here and here.
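
To make the quadtree option a bit more concrete, here is a minimal Python sketch of a bucket quadtree that supports insertion, removal, and kNN queries via best-first search. This is not code from the index library or from TinSpin; the class name `QuadTree`, the `capacity` and `max_depth` parameters, and the method names are all made up for illustration. It assumes every point lies inside the root square, and it never merges cells that become empty after removals, which a real implementation would probably want to do.

```python
import heapq
import itertools


class QuadTree:
    """Minimal bucket quadtree over a fixed square region (illustrative only)."""

    def __init__(self, cx, cy, half, capacity=8, depth=0, max_depth=16):
        self.cx, self.cy, self.half = cx, cy, half  # centre and half-width of the cell
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []      # points stored here while this node is a leaf
        self.children = None  # four sub-cells once this node has been split

    def _child(self, x, y):
        # Index 0..3: bit 0 selects the right half, bit 1 the upper half.
        return self.children[(x >= self.cx) + 2 * (y >= self.cy)]

    def insert(self, x, y):
        if self.children is not None:
            self._child(x, y).insert(x, y)
            return
        self.points.append((x, y))
        # Split an overfull leaf unless the depth limit has been reached.
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            h = self.half / 2
            self.children = [
                QuadTree(self.cx + dx * h, self.cy + dy * h, h,
                         self.capacity, self.depth + 1, self.max_depth)
                for dy in (-1, 1) for dx in (-1, 1)
            ]
            pts, self.points = self.points, []
            for px, py in pts:
                self._child(px, py).insert(px, py)

    def remove(self, x, y):
        """Remove one copy of (x, y); return True if it was present."""
        if self.children is not None:
            return self._child(x, y).remove(x, y)
        try:
            self.points.remove((x, y))
            return True
        except ValueError:
            return False

    def _min_dist2(self, x, y):
        # Squared distance from (x, y) to this cell's square (0 if inside).
        dx = max(abs(x - self.cx) - self.half, 0.0)
        dy = max(abs(y - self.cy) - self.half, 0.0)
        return dx * dx + dy * dy

    def nearest(self, x, y, k=1):
        """Return up to k stored points closest to (x, y), nearest first."""
        tie = itertools.count()  # tie-breaker so the heap never compares nodes
        heap = [(self._min_dist2(x, y), next(tie), self, None)]
        result = []
        while heap and len(result) < k:
            _, _, node, point = heapq.heappop(heap)
            if point is not None:
                # A point popped before any closer cell is the next nearest.
                result.append(point)
            elif node.children is not None:
                for c in node.children:
                    heapq.heappush(heap, (c._min_dist2(x, y), next(tie), c, None))
            else:
                for p in node.points:
                    d2 = (p[0] - x) ** 2 + (p[1] - y) ** 2
                    heapq.heappush(heap, (d2, next(tie), None, p))
        return result


tree = QuadTree(0.5, 0.5, 0.5, capacity=2)  # covers the unit square
for p in [(0.1, 0.1), (0.9, 0.9), (0.4, 0.4), (0.6, 0.6)]:
    tree.insert(*p)
tree.remove(0.4, 0.4)
print(tree.nearest(0.5, 0.5, k=1))  # prints [(0.6, 0.6)]
```

Updates only touch one root-to-leaf path, which is why this kind of structure copes well with frequent insertions and removals. For points in constant motion, a remove followed by an insert at the new position is the simplest (though not necessarily the fastest) way to handle a move.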

Tags: algorithm, data-structures, computational-geometry, 2d, nearest-neighbor

Richard
