Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging/extending records with identical type in SQL Server

I'm having problems with merging records of identical types with a consecutive sequence and calculating the full sequence from the merged records.

The ordering should be done on the basis ID as the sequences might rollover to 0 when they reach 100. See last entry in input/output example.

Is it possible to have an input as listed below and have a query that produces the output which is also listed below in SQL Server 2012?

Input

Id     Type     Begin     End
-----------------------------
1      1        10        20
2      1        21        23
3      2        24        28
4      1        29        40
5      2        41        47
6      2        48        50
7      2        75        80
8      1        81        100
9      1        0         10
10     1        11        20
11     1        21        5
12     1        5         6

Output

FromId     ToId     Type     Begin     End    Length
----------------------------------------------------
1          2        1        10        23     13 (23-19)
3          3        2        24        28     4  (28-24)
4          4        1        29        40     11 (40-29)
5          6        2        41        50     9  (50-41)
7          7        2        75        80     5  (80 - 75)
8          12       1        81        20     227*

*(100-81) + 10 + (100-11 + 20) + (100-21 + 5) + 1 -> rolloverS of seq

EDIT

Please note that rows 6 and 7 from the source are not merged because they are not consecutive. Row 6 ends with 50 and row 7 starts with 75. Only consecutive rows with the same type need to be merged.

like image 286
Frank Avatar asked May 10 '13 09:05

Frank


2 Answers

hm... very interesting task I have ended up with the following results

Type    IntervalBegin   CurrEnd
1   10  20
1   NULL    23
2   24  28
1   29  40
2   41  47
2   NULL    50
2   75  80
1   81  100
1   NULL    10
1   10  20

but I am still puzzled about aggregating recieved results...

the query is below

DECLARE @MyTable TABLE ([Id] INT, [Type] INT, [Begin] INT, [End] INT)

INSERT INTO @MyTable([Id], [Type], [Begin], [End] )
VALUES
    (1, 1, 10, 20),
    (2, 1, 21, 23),
    (3, 2, 24, 28),
    (4, 1, 29, 40),
    (5, 2, 41, 47),
    (6, 2, 48, 50),
    (7, 2, 75, 80),
    (8, 1, 81, 100),
    (9, 1, 0, 10),
    (10, 1, 10, 20)

    SELECT 
        [Type],
        CASE
            WHEN ShouldCompareWithPrevious = 1 AND PrevBegin IS NULL THEN CurrBegin 
            WHEN ShouldCompareWithPrevious = 1 AND PrevEnd = 100 AND CurrBegin = 0 THEN NULL
            WHEN ShouldCompareWithPrevious = 1 AND PrevEnd + 1 <> CurrBegin THEN CurrBegin
            WHEN ShouldCompareWithPrevious = 0 THEN CurrBegin 
            ELSE NULL
        END IntervalBegin,
        CurrEnd
    FROM
    (

        SELECT t1.[Id], t2.[Id] t2Id,
            t1.[Type], t2.[Type] t2Type,
            (
                CASE
                 WHEN t2.[Type]  IS NULL THEN 0
                 WHEN t2.[Type] = t1.[Type] THEN 1
                 ELSE
                    0
                END
            ) AS ShouldCompareWithPrevious, 
            t1.[Begin] CurrBegin,
            t1.[End] CurrEnd,
            t2.[Begin] PrevBegin, 
            t2.[End] PrevEnd
        FROM @MyTable t1
        LEFT OUTER JOIN @MyTable t2
            ON t1.Id = t2.Id + 1
    ) intermideate
like image 43
Yaugen Vlasau Avatar answered Oct 15 '22 02:10

Yaugen Vlasau


You last row have Begin = 10 which is not follow the same rule with others. I update it 11 in my example. Hope this will help.

SQL Fiddler

WITH typeRowNum AS (  
  SELECT *, ROW_NUMBER() OVER (ORDER BY Id ASC) AS rownum
  FROM tblType
)
,rw AS (
  SELECT t1.*, 
    CASE WHEN t1.[type] = t2.[type] and ( t1.[Begin] = t2. [end] + 1 OR t1.[Begin] + 100 = t2.[end])
      THEN -1 
      ELSE t1.rownum 
    END AS group_id
  FROM typeRowNum t1
  LEFT JOIN typeRowNum t2 
    ON t2.rownum = t1.rownum - 1
 )
, cte AS (
SELECT *,
  new_end = ISNULL(
    (SELECT MIN(rownum) - 1 FROM rw r2 WHERE r2.rownum > r1.rownum and r2.group_id > r1.group_id),
    (SELECT MAX(rownum) FROM rw)
  )
FROM rw r1
WHERE r1.group_id > 0
 )
select 
  c1.id,c1.type,c1.[begin],c2.[end]
 ,[length] = (SELECT SUM((r.[end]  - r.[Begin]
                  + CASE WHEN r.[end] < r.[Begin] THEN 100 ELSE 0 END 
                  + CASE WHEN (r.group_id = -1) AND (r.[Begin] < r.[End]) THEN 1 ELSE 0 END)
                 ) 
             FROM rw r WHERE r.rownum  BETWEEN c1.[rownum] AND c2.[rownum])  
FROM cte c1
LEFT JOIN rw c2
  ON c1.new_end = c2.rownum

UPDATE: If you have NULL value, most likely you have some discontinued value in [Id] Column. Instead, you can use Row_Number to JOIN. I updated my answer above.

like image 107
EricZ Avatar answered Oct 15 '22 02:10

EricZ