What kind of data structure could be used for an efficient nearest neighbor search in a large set of geo coordinates? With "regular" spatial index structures like R-Trees that assume planar coordinates, I see two problems (Are there others I have overlooked?):
How can these factors be allowed for? I guess the second one could compensated by transforming the coordinates. Can an R-Tree be modified to take wraparound into account? Or are there specialized geo-spatial index structures?
The geospatial index supports distance, containment, and intersection queries for various geometric 2D shapes. You should mainly be using AQL queries to perform these types of operations. The index can operate in two different modes, depending on if you want to use the GeoJSON data-format or not.
Geospatial Indexes and Sharded Collections However, you can create a geospatial index on a sharded collection by using a different field as the shard key. The following geospatial operations are supported on sharded collections: $geoNear aggregation stage. $near and $nearSphere query operators (starting in MongoDB 4.0)
4.1 Creating a Spatial Index. Once data has been loaded into the spatial tables through either bulk or transactional loading, a spatial index must be created on the tables for efficient access to the data. Each spatial index can be an R-tree index or a quadtree index.
In MongoDB, you can store geospatial data as GeoJSON objects or as legacy coordinate pairs.
Could you use a locality-sensitive hashing (LSH) algorithm in 3 dimensions? That would quickly give you an approximate neighboring group which you could then sanity-check by calculating great-circle distances.
Here's a paper describing an algorithm for efficient LSH on the surface of a unit d-dimensional hypersphere. Presumably it works for d=3.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With