Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Preserve start points in UnionAggregate

Use case 1:

DECLARE @Geom TABLE 
( 
   shape geometry, 
   shapeType nvarchar(50) 
); 

INSERT INTO @Geom(shape,shapeType) 
VALUES('LINESTRING(1 2, 3 4)', 'A'), 
('LINESTRING(3.2 4, 7 8)', 'B'); 

SELECT *
FROM @Geom

SELECT geometry::UnionAggregate(shape).ToString(), geometry::UnionAggregate(shape)
FROM @Geom;

The WKT for the output is

MULTILINESTRING ((7 8, 3.2 4), (3 4, 1 2))

when I would want

MULTILINESTRING ((1 2, 3 4), (3.2 4, 7 8))

Where the beginning of the "A" and "B" line should be (1 2) and (3.2 4) respectfully.

This behavior of UnionAggregate doesn't seem to care about "direction" of the geometry in order to maintain that A union B and B union A is the same result. However, I want to preserve start/endpoints as I am unioning street geometry and I want all the LINESTRINGs to go in their original direction.

This problem is discussed here: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/89e95366-3649-4294-a0bc-f3921598157f/union-of-linestrings-and-reversing-direction?forum=sqlspatial

They seem to suggest at a possible solution about checking the end result, but it is not clear to me how to do that. It is hinted at in a linked thread that

The MultiLineString always represents the graph from the point which farthest from the origin point.

It is not clear to me exactly what this means, but I don't think I can just assume the result of a UnionAggregate is always the reverse of what I want

If it is hard to know directional intent then I can add M measures where the direction should follow increasing M values.

Assuming I have a method for reversing the points in line, how would I go about solving for this?

I found a function that mimics for STUnion for added support for Z and M measure: http://www.spatialdbadvisor.com/files/SQLServer.html#robo48 however it is noted that "their direction could change (eg Start/Start Point relationship).", which is what I want to avoid.

Edit:

The functionality I also need is that when to LINESTRING have a shared endpoint, the result is a connect LINESTRING

Use case 2:

DECLARE @Geom TABLE 
( 
   shape geometry, 
   shapeType nvarchar(50) 
); 

INSERT INTO @Geom(shape,shapeType) 
VALUES('LINESTRING(1 2, 3 4)', 'A'), 
('LINESTRING(3 4, 7 8)', 'B'); 

SELECT *
FROM @Geom

SELECT geometry::UnionAggregate(shape).ToString(), geometry::UnionAggregate(shape)
FROM @Geom;

This results in WKT LINESTRING (7 8, 3 4, 1 2)

When I would want

LINESTRING (1 2, 3 4, 7 8)


Attempt at solution

The geometry::CollectionAggregate(shape).Reduce(0) as suggested by Clay solves use case 1. I tried just using STUnion on the result with an empty linestring and while it works it falls back to the incorrect ordering.

I suspect the solution will be a scaler function similar to ST_LineMerge which takes the result of the CollectionAggregate (MULTILINESTRING) and then merges the points together when it can into one LINESTRING, and when it can't returns the geometry back unaltered

like image 592
ParoX Avatar asked Feb 18 '19 04:02

ParoX


3 Answers

The geometry types don't record/encode directionality. The lines that you give it may be considered "undirected" or "bi-directional". This returns 1:

select geometry::STGeomFromText('LINESTRING(1 2, 3 4)',0).STEquals(
       geometry::STGeomFromText('LINESTRING(3 4, 1 2)',0))

So what you're looking for isn't available using these types. You consider the "start points" to be special. I suggest you separately record those as individual POINTs.

This does make all of the resulting code uglier now though - you have to keep these data pairs processed together:

DECLARE @Geom TABLE 
(
   start geometry, 
   shape geometry, 
   shapeType nvarchar(50) 
); 

INSERT INTO @Geom(start,shape,shapeType) 
VALUES('POINT(1 2)','LINESTRING(1 2, 3 4)', 'A'), 
('POINT(3.2 4)','LINESTRING(3.2 4, 7 8)', 'B'); 

SELECT *
FROM @Geom

SELECT
    geometry::UnionAggregate(start).ToString(), geometry::UnionAggregate(shape).ToString(),
    geometry::UnionAggregate(start), geometry::UnionAggregate(shape)
FROM @Geom;

At this point you may decide to stop using the geography type directly - you can create a CLR UDT that references SqlGeography (a CLR surfacing of the same type) and uses that internally but also tracks it's "directionality" too, all wrapped up together, and start using that instead.

You're unlikely to want to surface all of the geography methods in that wrapper though - you'll have to pick and choose your battles. And, of course, since it's not really the SQL Server geography turning up in your results, you won't get the benefit of the "Spatial Results" tab in Management Studio.


The only place I can think of where some "directionality" does exist in these types is the left-hand rule for disambiguating geography shapes.

like image 115
Damien_The_Unbeliever Avatar answered Nov 16 '22 04:11

Damien_The_Unbeliever


Originally, I suggested...

DECLARE @Geom TABLE 
( 
   shape geometry, 
   shapeType nvarchar(50) 
); 

INSERT @Geom(shape,shapeType) VALUES
  ('LINESTRING(1 2, 3 4)', 'A'), 
  ('LINESTRING(3.2 4, 7 8)', 'B'); 

SELECT * FROM @Geom

SELECT 
  geometry::CollectionAggregate(shape).Reduce(0).ToString(), 
  geometry::CollectionAggregate(shape).Reduce(0)
FROM @Geom

You get:

enter image description here

...however, it became clear to me that the answer I gave isn't quite good enough. For example, it's kinda hard to keep Reduce() from simplifying away part of your lines,

I still like the CollectionAggregate for getting your original array of lines into a single thing, but then I figured there just has to be a way of building the requisite geometry structure.

I played with this several times, and this iteration will eval to a LineString or a MultiLineString depending on whether there are disjoint LineString elements in the inputs:

create function dbo.SimplifyToLine( @geo geometry ) returns geometry as
begin
  declare 
    @numSubGeos int = @geo.STNumGeometries(),
    @subGeoIdx int = 1,
    @sql nvarchar( max ) = N'',
    @subGeo geometry,
    @oldEndX float = -1.0e26,
    @oldEndY float = -1.0e26,
    @startX float,
    @startY float,
    @endX float,
    @endY float,
    @idx int,
    @numPoints int,
    @point geometry,
    @segment int = 1,
    @continue bit,
    @result geometry,
    @started bit = 0

  declare
    @geos table
    ( 
      Idx int primary key, 
      SubGeo geometry, 
      StartX decimal, 
      EndX decimal, 
      StartY decimal, 
      EndY decimal, 
      NumPoints int, 
      ContinueFromPrevious bit 
    ) 

  declare
    @multiLines table
    (
      Idx int primary key,
      Segment nvarchar(max)
    )

  --> collect geometries and extents...
  while ( @subGeoIdx <= @numSubGeos )  
  begin

    select @subGeo = @geo.STGeometryN( @subGeoIdx )

    select 
      @startX = @subGeo.STPointN( 1 ).STX,
      @startY = @subGeo.STPointN( 1 ).STY,
      @endX = @subGeo.STPointN( @subGeo.STNumPoints( ) ).STX,
      @endY = @subGeo.STPointN( @subGeo.STNumPoints( ) ).STY

    insert @geos values
    ( 
      @subGeoIdx,
      @subGeo, 
      @startX, 
      @endX, 
      @startY, 
      @endY, 
      @subGeo.STNumPoints() ,
      case when @subGeoIdx = 1 then 1 when @oldEndX = @startX and @oldEndY = @startY then 1 else 0 end
    )   

    select 
      @oldEndX = @endX, 
      @oldEndY = @endY, 
      @subGeoIdx = @subGeoIdx + 1
  end


  if not exists ( select * from @geos where ContinueFromPrevious = 0 ) --> then all LineStrings are connected 
  begin
    --> build a single LINESTRING( )...
    select @sql = ''
    declare c cursor for select SubGeo, StartX, EndX, StartY, EndY, NumPoints, ContinueFromPrevious from @geos order by Idx  
    open c
    while ( 1 = 1 )
    begin
      fetch next from c into @subGeo, @startX, @endX, @startY, @endY, @numPoints, @continue
      if @@fetch_status != 0 break;
      select @idx = case when @started = 0 then 1 else 2 end, @started = 1  --> accrue all points, de-duplicating line ends...
      while ( @idx <= @numPoints )
      begin
        select @point = @subGeo.STPointN( @idx )
        select @sql += convert( nvarchar, @point.STX ) + N' ' + convert( nvarchar, @point.STY ) + N','
        select @idx = @idx + 1
      end
    end
    close c
    deallocate c
    select @sql = substring( @sql, 1, len( @sql ) -1 )
    select @result =  geometry::STGeomFromText(N'LINESTRING(' + @sql + N')', 0 ) 
  end
  else  --> we have disjoint lines in the inputs...
  begin
    select @sql = N'', @started = 0
    --> build a MULTILINESTRING((),()...) with line segements terminated at disjoint points..
    declare c cursor for select SubGeo, StartX, EndX, StartY, EndY, NumPoints, ContinueFromPrevious from @geos  order by Idx
    open c
      while ( 1=1 )
      begin
        fetch next from c into @subGeo, @startX, @endX, @startY, @endY, @numPoints, @continue
        if @@fetch_status != 0 break;
        if @continue = 1
        begin
          select @idx = case when @started = 0 then 1 else 2 end, @started = 1
          while ( @idx <= @numPoints )
          begin
            select @point = @subGeo.STPointN( @idx )
            select @sql += convert( nvarchar, @point.STX ) + N' ' + convert( nvarchar, @point.STY ) + N','
            select @idx = @idx + 1
          end
        end
        else
        begin
          insert @multiLines values ( @segment, substring( @sql, 1, len( @sql ) -1 ) ) --> collect the segment
          select @idx = 1, @sql = N'', @segment = @segment + 1
          while ( @idx <= @numPoints )
          begin
            select @point = @subGeo.STPointN( @idx )
            select @sql += convert( nvarchar, @point.STX ) + N' ' + convert( nvarchar, @point.STY ) + N','
            select @idx = @idx + 1
          end
        end
      end
    close c
    deallocate c
    insert @multiLines values ( @segment, substring( @sql, 1, len( @sql ) -1 ) )
    select @sql = N''
    select @sql += N'(' + Segment + N'),' from @multiLines order by Idx --> appends all segments
    select @sql = substring( @sql, 1, len( @sql ) -1 )
    select @result = geometry::STGeomFromText( 'MULTILINESTRING('+ @sql + N')', 1 )
  end

...and finally, given:

DECLARE @Geom TABLE 
( 
   shape geometry, 
   shapeType nvarchar(50) 
); 

INSERT @Geom(shape,shapeType) VALUES
  ('LINESTRING(1 2, 3 4)', 'A'), 
  ('LINESTRING(3 4, 9 9)', 'B'),  --> disjoint from here to the next LINESTRING
  ('LINESTRING(9 8, 3 4)', 'C'),
  ('LINESTRING(3 4, 1 2)', 'D'); 

select 
  dbo.SimplifyToLine(geometry::CollectionAggregate(shape)).ToString(),
  dbo.SimplifyToLine(geometry::CollectionAggregate(shape))
from 
  @Geom

delete @Geom

INSERT @Geom(shape,shapeType) VALUES
('LINESTRING(1 2, 3 4)', 'A'), 
('LINESTRING(3 4, 9 8)', 'B'),
('LINESTRING(9 8, 3 4)', 'C'),
('LINESTRING(3 4, 1 2)', 'D'); 

select 
  dbo.SimplifyToLine(geometry::CollectionAggregate(shape)).ToString(),
  dbo.SimplifyToLine(geometry::CollectionAggregate(shape))
from
  @Geom

...you get:

this

like image 35
Clay Avatar answered Nov 16 '22 03:11

Clay


Going off Clay's idea of passing in a GeometryCollection I implemented a robust version that will take any combination of POINT, MULTIPOINT, LINESTRING, MULTILINESTRING and remove any touching endpoints within a @Tolerance and create a POINT, LINESTRING, MULTILINESTRING

Here is a demostration of it working (notice how the tolerance of 0 and 0.1 makes a difference for the 2nd and 3rd output):

DECLARE @GeometryCollection GEOMETRY = GEOMETRY::STGeomFromText('GEOMETRYCOLLECTION (LINESTRING (1 2, 3 4), LINESTRING (3 4, 100 100), LINESTRING (9 8, 3 4), LINESTRING (3 4, 1 2), POINT(1 2), POINT(1 2), POINT(1 2))',0)
SELECT [dbo].[fnSimplifyToLine](@GeometryCollection, 0).ToString();
--Output: MULTILINESTRING ((1 2, 3 4, 100 100), (9 8, 3 4, 1 2))

SET @GeometryCollection  = GEOMETRY::STGeomFromText('GEOMETRYCOLLECTION (LINESTRING (1 2, 3 4.1), LINESTRING (3 4, 9 9, 6 1))',0)
SELECT [dbo].[fnSimplifyToLine](@GeometryCollection, 0).ToString()
--Output: MULTILINESTRING ((1 2, 3 4.1), (3 4, 9 9, 6 1))

SET @GeometryCollection  = GEOMETRY::STGeomFromText('GEOMETRYCOLLECTION (LINESTRING (1 2, 3 4.1), LINESTRING (3 4, 9 9, 6 1))',0)
SELECT [dbo].[fnSimplifyToLine](@GeometryCollection, 0.1).ToString()
--Output: LINESTRING (1 2, 3 4.1, 9 9, 6 1)

SET @GeometryCollection  = GEOMETRY::STGeomFromText('GEOMETRYCOLLECTION (POINT(1 2))',0)
SELECT [dbo].[fnSimplifyToLine](@GeometryCollection, 0).ToString()
--Output: POINT (1 2)

SET @GeometryCollection  = GEOMETRY::STGeomFromText('GEOMETRYCOLLECTION (MULTIPOINT((1 2), (2 3)))',0)
SELECT [dbo].[fnSimplifyToLine](@GeometryCollection, 0).ToString()
--Output:  (1 2, 2 3)

First I had to create a recursive CTE function that takes a geometry and extracts all points.

CREATE FUNCTION [dbo].[fnGetPoints]
(   
    @Geometry GEOMETRY
)
RETURNS TABLE 
AS
RETURN 
(

    WITH GeometryPoints(N, Point) AS ( 
        SELECT 
            CAST(1 AS DECIMAL(9,2)) as N
            ,@Geometry.STPointN(1) as Point
        UNION ALL
        SELECT 
            CAST(N + 1.0 AS DECIMAL(9,2)) as N
            ,@Geometry.STPointN(N + 1) as Point
        FROM GeometryPoints GP
        WHERE N < @Geometry.STNumPoints()  
    )

    SELECT *
    FROM GeometryPoints
)

Then I created a function that CROSS APPLY fnGetPoints to every geometry in the @GeometryCollection to get a point matrix. Using windowed function (LAG) to find places where the endpoints are within a @Tolerance and remove those points. Then I did a data smear to combine the geometries where they shared endpoints.

CREATE FUNCTION [dbo].[fnSimplifyToLine] (@GeometryCollection GEOMETRY, @Tolerance DECIMAL(19,10))
RETURNS GEOMETRY
AS
BEGIN
    DECLARE @PointMatrix TABLE (
        PointId INT,
        LinestringId INT,
        GeometryIndex INT,
        GeometryType varchar(100),
        PointIndex INT,
        Point GEOMETRY,
        Duplicate BIT
    );

    DECLARE @Linestrings TABLE (
        LinestringId INT,
        PointArrayStr varchar(max)
    );

    WITH CollectionGeometries(N, Geom) AS ( 
        SELECT 
            CAST(1 AS DECIMAL(9,2)) as N
            ,@GeometryCollection.STGeometryN(1) as Geom
        UNION ALL
        SELECT 
            CAST(N + 1.0 AS DECIMAL(9,2)) as N
            , @GeometryCollection.STGeometryN(N + 1) as Geom
        FROM CollectionGeometries CG
        WHERE N < @GeometryCollection.STNumGeometries()
    ), PointMatrix AS (
        SELECT 
            ROW_NUMBER() OVER(ORDER BY G.N, P.N) as PointId
            ,G.N as GeometryIndex
            ,G.Geom.STGeometryType() as GeometryType
            ,P.N as PointIndex
            ,P.Point
        FROM CollectionGeometries G
        CROSS APPLY dbo.fnGetPoints(Geom) P
    )

    INSERT INTO @PointMatrix
    SELECT 
        PointId
        ,GeometryIndex as LinestringId
        ,GeometryIndex
        ,GeometryType
        ,PointIndex
        ,Point
        ,CASE 
            WHEN 
                GeometryIndex != LAG(GeometryIndex) OVER(ORDER BY PointId)
                AND ABS(Point.STX - LAG(Point.STX) OVER(ORDER BY PointId)) <= @Tolerance
                AND ABS(Point.STY - LAG(Point.STY) OVER(ORDER BY PointId)) <= @Tolerance
            THEN 1
            ELSE 0
        END as Duplicate
    FROM PointMatrix
    OPTION (MAXRECURSION 10000)



    -- POLYGON, MULTIPOLYGON, GEOMETRYCOLLECTION, CIRCULARSTRING, COMPOUNDCURVE, CURVEPOLYGON not supported
    IF EXISTS ( SELECT * FROM @PointMatrix WHERE GeometryType NOT IN ('POINT', 'MULTIPOINT', 'LINESTRING', 'MULTILINESTRING'))
        RETURN CAST('Geometries in @GeometryCollection must all be IN (''POINT'',''MULTIPOINT'', ''LINESTRING'', ''MULTILINESTRING'')' as GEOMETRY);

    DECLARE @SRID INT = (SELECT DISTINCT Point.STSrid FROM @PointMatrix)

    UPDATE @PointMatrix
    SET LinestringId = NULL
    WHERE GeometryIndex IN (
        SELECT GeometryIndex FROM @PointMatrix WHERE Duplicate = 1
    )

    DELETE @PointMatrix
    WHERE Duplicate = 1;

    -- Data smear
    WITH Cnt AS (
        SELECT PointId, Point, LinestringId,c=COUNT(LinestringId) OVER (ORDER BY PointId)
        FROM @PointMatrix
    ), SmearedLineStringId AS (
        SELECT PointId, Point, LinestringId=MAX(LinestringId) OVER (PARTITION BY c)
        FROM Cnt
    )

    INSERT @Linestrings
    SELECT 
        LinestringId
        ,'(' + 
                STUFF((
                    SELECT ',' + CAST(Point.STX as varchar(100)) + ' ' + CAST(Point.STY as varchar(100))
                    FROM SmearedLineStringId t2
                    WHERE t1.LinestringId = t2.LinestringId 
                    ORDER BY PointId
                    FOR XML PATH ('')
                ), 1, 1, '')
        + ')' as PointArray
    FROM SmearedLineStringId t1
    GROUP BY LinestringId

    DECLARE @Type varchar(100) = CASE 
        WHEN 1 =(SELECT COUNT(*) FROM @PointMatrix) THEN
            'POINT'
        WHEN 1 =(SELECT COUNT(*) FROM @Linestrings) THEN
            'LINESTRING'
        ELSE
            'MULTILINESTRING'
    END

    DECLARE @BeginParens char(1) = '(';
    DECLARE @EndParens char(1) = ')'

    IF @Type != 'MULTILINESTRING'
    BEGIN
        SET @BeginParens = '';
        SET @EndParens = '';
    END

    DECLARE @Wkt varchar(max) = @Type + @BeginParens + 
                STUFF((
                    SELECT ',' + PointArrayStr
                    FROM @Linestrings t2
                    ORDER BY LinestringId
                    FOR XML PATH ('')
                ), 1, 1, '')
            + @EndParens

    RETURN Geometry::STGeomFromText(@Wkt, @SRID)

END
GO
like image 41
ParoX Avatar answered Nov 16 '22 02:11

ParoX