
Averaging out Lat/longs in SQL Server database

I'm new to SQL Server and am trying to figure out how to do the following:

I have thousands of lat/long positions pointing to the same or very nearby locations. They are all stored flat in a SQL Server table as LAT and LONG columns.

To cluster the lat/longs and pick one representative point per cluster, what should I be doing?

I read about a method called STCentroid: https://msdn.microsoft.com/en-us/library/bb933847.aspx

But is it worth letting the server build a polygon from all these million rows and find its center point? That would implicitly give a single representation for all the nearby duplicates. Might this be an inefficient or wrong way to do it?

Only points within a few meters of each other should be considered duplicate entries. I'm trying to work out how to pick the right representative point.

In better words:

Suppose a group of points G1{} (GPS positions) is trying to point to a physical location L1, and another group of points G2{} is trying to point to a location L2. How do I derive a center point CP1 from G1{} and CP2 from G2{}, such that CP1 is very close to L1 and CP2 is very close to L2?

The catch is that L1 and L2 could be very near each other, say 10 feet apart.

I'm just wondering how to approach this problem. Any help, please?

user1534863 asked Aug 23 '15 06:08



1 Answer

Clustering points will be problematic. You are going to have issues if two potential clusters sit close together, and if you need precision or optimization, you will need to do some research on your implementation. A starting point: the Wikipedia article on cluster analysis.

However, if the point clusters are fairly far apart, you could try a fairly simple clustering pass and then find the envelopes.

Something like the loop below may work, although you would be well served to create an actual spatial column and add a spatial index.
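
For the spatial column and index mentioned above, a minimal sketch might look like the following. It assumes LAT and LONG are numeric columns (FLOAT or DECIMAL) and that Recordset already has a clustered primary key, which SQL Server requires before it will create a spatial index. The names GeoPoint and SIX_Recordset_GeoPoint are hypothetical.

```sql
-- Assumption: LAT and LONG are numeric, and Recordset has a clustered primary key.
-- GeoPoint and SIX_Recordset_GeoPoint are illustrative names, not from the original post.

-- Store the point once as a persisted computed column instead of rebuilding it per query.
-- GEOGRAPHY::Point takes latitude first, then longitude, then the SRID.
ALTER TABLE Recordset
  ADD GeoPoint AS GEOGRAPHY::Point(LAT, LONG, 4326) PERSISTED;
GO

-- Index the stored column so distance predicates such as
-- GeoPoint.STDistance(@g) < @Limit can use it.
CREATE SPATIAL INDEX SIX_Recordset_GeoPoint ON Recordset (GeoPoint);
GO
```

With such a column in place, the GEOGRAPHY::Point(...) expressions in the loop could be replaced by GeoPoint. The loop itself, working directly on the LAT and LONG columns, follows.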

ALTER TABLE Recordset ADD ClusterID INT -- Add a grouping ID
GO
DECLARE @i INT --Group Counter
DECLARE @g GEOGRAPHY --Point from which the cluster will be made
DECLARE @Limit INT --Distance limit in meters (STDistance returns meters for SRID 4326)
SET @Limit = 10

SET @i = 0
WHILE (SELECT COUNT(*) FROM Recordset R WHERE ClusterID IS NULL) > 0 --Loop until all points are clustered
BEGIN
  --GEOGRAPHY::Point takes latitude first and avoids the precision lost by casting to VARCHAR
  SET @g = (SELECT TOP 1 GEOGRAPHY::Point(LAT, LONG, 4326) FROM Recordset WHERE ClusterID IS NULL) --Point to cluster on
  UPDATE Recordset SET ClusterID = @i WHERE GEOGRAPHY::Point(LAT, LONG, 4326).STDistance(@g) < @Limit AND ClusterID IS NULL --Update all points within the limit circle

  SET @i = @i + 1
END

SELECT --Clustered centers
  ClusterID,
  GEOGRAPHY::ConvexHullAggregate(GEOGRAPHY::Point(LAT, LONG, 4326)).EnvelopeCenter().Lat AS LatCenter,
  GEOGRAPHY::ConvexHullAggregate(GEOGRAPHY::Point(LAT, LONG, 4326)).EnvelopeCenter().Long AS LongCenter
FROM
  Recordset
GROUP BY
  ClusterID
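
The envelope center is determined by the outermost points of each cluster, so a single stray reading can shift it. Since the question is about averaging, an alternative worth comparing is a plain arithmetic mean per cluster. This is only a sketch: it assumes the ClusterID column has been populated by the loop above, that LAT and LONG are numeric, and that no cluster straddles the 180° meridian or sits near a pole, where averaging raw coordinates breaks down. The aliases PointCount, LatMean and LongMean are hypothetical.

```sql
-- Assumption: ClusterID is already filled in and each cluster spans only a few meters,
-- so averaging raw coordinates is a reasonable approximation of the center.
SELECT
  ClusterID,
  COUNT(*) AS PointCount,                  -- how many readings support this center
  AVG(CAST(LAT AS FLOAT)) AS LatMean,      -- CAST keeps full precision if LAT is DECIMAL
  AVG(CAST(LONG AS FLOAT)) AS LongMean
FROM
  Recordset
GROUP BY
  ClusterID;
```

PointCount is included so that clusters backed by only one or two readings can be spotted and treated with more caution.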
hcaelxxam answered Oct 10 '22 00:10
