Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Assign stage identifier to a table of start-stop events

I would like to add a computed column to a table of start-stop events to act as a 'stage' identifier. Unfortunately, my data from the source system is not always clean, by which I mean there won't always be a pair of start and stop events; some times there are stop events without a corresponding start, and conversely a start event without a corresponding stop.

The table below shows my test data. The penultimate column is my incorrect attempt at computing this column using a windowed function. The final column shows the correct expect result I'm looking for.

The columns are:

  • id: A 'job' identifier to which start stop events belong.
  • ordinal: The order of the events within a job.
  • category: Whether the event is a start or stop. a = start, b = stop.
  • Incorrect: My incorrect attempt at computing this stage id.
  • Correct: The result I'm aiming for.

id ordinal category Incorrect Correct 1 1 a 1 1 1 2 b 1 1 1 3 a 2 2 1 4 b 2 2 2 1 a 1 1 2 2 b 1 1 2 3 a 2 2 2 4 b 2 2 2 5 a 3 3 3 1 b 1 1 3 2 a 1 2 3 3 b 2 2 3 4 a 2 3 3 5 b 3 3

As can be seen, my attempt works as intended for ids 1 and 2, but not for 3 where it doesn't group the events into stages as required.

My query so far is:

select *,
       Incorrect = row_number() over (partition by id, category order by ordinal)
from #Test
order by id, ordinal

Hope this is clear enough. Thanks for the help.

like image 599
Mark Avatar asked Nov 27 '25 23:11

Mark


1 Answers

You can try this:

SELECT *
       ,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
FROM @DataSource
ORDER BY [id], [ordinal];

enter image description here

Here is the full code:

DECLARE @DataSource TABLE
(
     [id] TINYINT
    ,[ordinal] TINYINT
    ,[category] CHAR(1)
);

INSERT INTO @DataSource ([id], [ordinal], [category])
VALUES   ('1', '1', 'a')
        ,('1', '2', 'b')
        ,('1', '3', 'a')
        ,('1', '4', 'b')
        ,('2', '1', 'a')
        ,('2', '2', 'b')
        ,('2', '3', 'a')
        ,('2', '4', 'b')
        ,('2', '5', 'a')
        ,('3', '1', 'b')
        ,('3', '2', 'a')
        ,('3', '3', 'b')
        ,('3', '4', 'a')
        ,('3', '5', 'b');


SELECT *
       ,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
FROM @DataSource
ORDER BY [id], [ordinal];

Note, that, in cases, when the id group's first element is b then the final columns values starts from 0. If this is a issue you can try to add 1 in such cases like this:

SELECT  DS.*
       ,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + dsf.[fix]
FROM @DataSource DS
INNER JOIN
(
    SELECT [id] as fix_id
          ,IIF(category = 'a', 0, 1) as fix
    FROM @DataSource
    WHERe [ordinal] = 1
) DSF
    ON DS.id = DSF.[fix_id]
ORDER BY [id], [ordinal];

enter image description here

like image 130
gotqn Avatar answered Nov 29 '25 11:11

gotqn



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!