Assign stage identifier to a table of start-stop events

Question

I would like to add a computed column to a table of start-stop events to act as a 'stage' identifier. Unfortunately, my data from the source system is not always clean, by which I mean there won't always be a pair of start and stop events; some times there are stop events without a corresponding start, and conversely a start event without a corresponding stop.

The table below shows my test data. The penultimate column is my incorrect attempt at computing this column using a windowed function. The final column shows the correct expect result I'm looking for.

The columns are:

id: A 'job' identifier to which start stop events belong.
ordinal: The order of the events within a job.
category: Whether the event is a start or stop. a = start, b = stop.
Incorrect: My incorrect attempt at computing this stage id.
Correct: The result I'm aiming for.

id ordinal category Incorrect Correct 1 1 a 1 1 1 2 b 1 1 1 3 a 2 2 1 4 b 2 2 2 1 a 1 1 2 2 b 1 1 2 3 a 2 2 2 4 b 2 2 2 5 a 3 3 3 1 b 1 1 3 2 a 1 2 3 3 b 2 2 3 4 a 2 3 3 5 b 3 3

As can be seen, my attempt works as intended for ids 1 and 2, but not for 3 where it doesn't group the events into stages as required.

My query so far is:

select *,
       Incorrect = row_number() over (partition by id, category order by ordinal)
from #Test
order by id, ordinal

Hope this is clear enough. Thanks for the help.

gotqn · Accepted Answer

You can try this:

SELECT *
       ,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
FROM @DataSource
ORDER BY [id], [ordinal];

enter image description here

Here is the full code:

DECLARE @DataSource TABLE
(
     [id] TINYINT
    ,[ordinal] TINYINT
    ,[category] CHAR(1)
);

INSERT INTO @DataSource ([id], [ordinal], [category])
VALUES   ('1', '1', 'a')
        ,('1', '2', 'b')
        ,('1', '3', 'a')
        ,('1', '4', 'b')
        ,('2', '1', 'a')
        ,('2', '2', 'b')
        ,('2', '3', 'a')
        ,('2', '4', 'b')
        ,('2', '5', 'a')
        ,('3', '1', 'b')
        ,('3', '2', 'a')
        ,('3', '3', 'b')
        ,('3', '4', 'a')
        ,('3', '5', 'b');


SELECT *
       ,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 
FROM @DataSource
ORDER BY [id], [ordinal];

Note, that, in cases, when the id group's first element is b then the final columns values starts from 0. If this is a issue you can try to add 1 in such cases like this:

SELECT  DS.*
       ,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + dsf.[fix]
FROM @DataSource DS
INNER JOIN
(
    SELECT [id] as fix_id
          ,IIF(category = 'a', 0, 1) as fix
    FROM @DataSource
    WHERe [ordinal] = 1
) DSF
    ON DS.id = DSF.[fix_id]
ORDER BY [id], [ordinal];

enter image description here

Assign stage identifier to a table of start-stop events

Tags:

sql

sql-server

t-sql

Mark

1 Answers

gotqn

Recent Activity

Donate For Us

Assign stage identifier to a table of start-stop events

Tags:

sql

sql-server

t-sql

Mark

1 Answers

gotqn

Related questions

Recent Activity

Donate For Us