I'm new to SQL Server. I'm trying to figure out how I can get the below one done:
I have thousands of lat/long positions pointing to the same OR very close by locations. It's all stored flat in a SQL Server table as LAT & LONG columns.
Now to cluster the lat/longs and pick one representation per cluster, what I must be doing?
I read through a method called "STCentroid" : https://msdn.microsoft.com/en-us/library/bb933847.aspx
But is it worth letting the Server do a polygon with all these million rows and find the center point? Which would implicitly mean a single representation for all the near by duplicates. Might be an in efficient/wrong way?
Only points around few meters must be considered as duplicate entries. I'm thinking how I can pick the right representation.
In better words:
If there's a group of points G1{} (GPS positions) trying to point to a location L1. (Physical loc). & There's a group of points G2{}, trying to point to a location L2. How do I derive Center Point CP1 from G1{}. & CP2 from G2{}, such that CP1 is very close to L1 & CP2 is very close to L2.
And the fact is, L1 & L2 could be very near to each other say, 10 feet.
Just thinking how do I approach this problem. Any help please?
Clustering points will be problematic. You are going to have issues if you have two potential clusters close together, and if you need precision or optimization, then you will need to do some research on your implementation. Try: Wiki-Cluster Analysis
However, if the points clusers are fairly far apart, then you could try a fairly simple cluster and then find the envelopes.
Something like this may work, although you would be well served to actually make a spatial column and add a spatial index.
ALTER TABLE Recordset ADD (ClusterID INT) -- Add a grouping ID
GO
DECLARE @i INT --Group Counter
DECLARE @g GEOGRAPHY --Point from which the cluster will be made
DECLARE @Limit INT --Distance limitation
SET @Limit = 10
SET @i = 0
WHILE (SELECT COUNT(*) FROM Recordset R WHERE ClusterID IS NULL) > 0 --Loop until all points are clustered
BEGIN
SET @g = (SELECT TOP 1 GEOGRAPHY::STPointFromText('POINT(' + CAST(LAT AS VARCHAR(20)) + ' ' + CAST(LONG AS VARCHAR(20)) + ')', 4326) WHERE ClusterID IS NULL) --Point to cluster on
UPDATE Recordset SET ClusterID = @i WHERE GEOGRAPHY::STPointFromText('POINT(' + CAST(LAT AS VARCHAR(20)) + ' ' + CAST(LONG AS VARCHAR(20)) + ')', 4326).STDistance(@g) < @Limit AND ClusterID IS NULL--update all points within the limit circle
SET @i = @i + 1
END
SELECT --Clustered centers
ClusterID,
GEOGRAPHY::ConvexHullAggregate(GEOGRAPHY::STPointFromText('POINT(' + CAST(LAT AS VARCHAR(20)) + ' ' + CAST(LONG AS VARCHAR(20)) + ')', 4326)).EnvelopeCenter().Lat AS 'LatCenter',
GEOGRAPHY::ConvexHullAggregate(GEOGRAPHY::STPointFromText('POINT(' + CAST(LAT AS VARCHAR(20)) + ' ' + CAST(LONG AS VARCHAR(20)) + ')', 4326)).EnvelopeCenter().Long AS 'LatCenter',
FROM
RecordSet
GROUP BY
ClusterID
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With