Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I do efficient range searching + counting with latitude/longitude data?

I'm working with a large set of points represented by latitude/longitude pairs (the points are not necessarily unique, there could be several points in the set that are at the same location). The points are stored in a database.

What I need to do is figure out a way to efficiently perform a search to get the number of points that lie within a given radius (say, 25 miles) of an arbitrary point. The count does not need to be 100% accurate - more importantly, it just has to be fast, and reasonably close to the correct count. Doing this with SQL is possible, by using a query with some trigonometry in the WHERE clause to filter points by their distance to a reference point. Unfortunately, this query is very, very expensive and caching is not likely to provide much help because the locations will be very spread out.

I'm ultimately looking to build some kind of in-memory structure that will be able to handle this kind of operation efficiently - trading off some of the accuracy and liveness of the data (maybe rebuilding it only once a day) in return for speed. I've been doing some research on kd-trees, but i'm not clear yet on how well this can be applied to latitude/longitude data (as opposed to x,y data in a 2d plane).

If anybody has any ideas or solutions I should look into, I'd really appreciate it - so thanks in advance.

like image 498
mpobrien Avatar asked Feb 05 '09 16:02

mpobrien


2 Answers

I don't think you should use this solution. Having randomly thought about it a few days ago, I think that measuring the distance from a specific point the grid squares' locations will be based on circles rather than a uniformed grid. The further away from 0,0 you are the less accurate this will be!

What I did was to have 2 additional values on my PostalCode class. Whenever I update the Long/Lat on a PostalCode I calculate an X,Y distance from Long 0, Lat 0.

public static class MathExtender
{
    public static double GetDistanceBetweenPoints(double sourceLatitude, double sourceLongitude, double destLatitude, double destLongitude)
    {
        double theta = sourceLongitude - destLongitude;
        double distance =
            Math.Sin(DegToRad(sourceLatitude))
            * Math.Sin(DegToRad(destLatitude))
            + Math.Cos(DegToRad(sourceLatitude))
            * Math.Cos(DegToRad(destLatitude))
            * Math.Cos(DegToRad(theta));
        distance = Math.Acos(distance);
        distance = RadToDeg(distance);
        distance = distance * 60 * 1.1515;
        return (distance);
    }


    public static double DegToRad(double degrees)
    {
        return (degrees * Math.PI / 180.0);
    }

    public static double RadToDeg(double radians)
    {
        return (radians / Math.PI * 180.0);
    }
}

Then I update my class like so:

private void CalculateGridReference()
{
    GridReferenceX = MathExtender.GetDistanceBetweenPoints(0, 0, 0, Longitude);
    GridReferenceY = MathExtender.GetDistanceBetweenPoints(0, 0, Latitude, 0);
}

So now I have an x,y grid distance (in miles) from grid reference 0,0 for each row in my DB. If I want to find all places with 5 miles of a long/lat I would first get the X,Y grid reference (say 25,75) then I would search 20..30, 70..80 in the DB, and further filter the results in memory using

MathExtensder.GetDistanceBetweenPoints(candidate.Lat, candidate.Long, search.Lat, search.Long) < TheRadiusOfInterest

The in DB part is ultra fast, and the in-memory part works on a smaller set to make it ultra accurate.

like image 63
Peter Morris Avatar answered Oct 13 '22 14:10

Peter Morris


Use R-Trees.

In Oracle, using Oracle Spatial, you can create an index:

CREATE INDEX ix_spatial ON spatial_table (locations) INDEXTYPE IS MDSYS.SPATIAL_INDEX;

that will create an R-Tree for you and search over it.

You may use any Earth Model you like: WGS84, PZ-90 etc.

like image 22
Quassnoi Avatar answered Oct 13 '22 15:10

Quassnoi