I would like to add a computed column to a table of start-stop events to act as a 'stage' identifier. Unfortunately, my data from the source system is not always clean, by which I mean there won't always be a pair of start and stop events; some times there are stop events without a corresponding start, and conversely a start event without a corresponding stop.
The table below shows my test data. The penultimate column is my incorrect attempt at computing this column using a windowed function. The final column shows the correct expect result I'm looking for.
The columns are:
id: A 'job' identifier to which start stop events belong.ordinal: The order of the events within a job.category: Whether the event is a start or stop. a = start, b = stop.Incorrect: My incorrect attempt at computing this stage id.Correct: The result I'm aiming for.id ordinal category Incorrect Correct
1 1 a 1 1
1 2 b 1 1
1 3 a 2 2
1 4 b 2 2
2 1 a 1 1
2 2 b 1 1
2 3 a 2 2
2 4 b 2 2
2 5 a 3 3
3 1 b 1 1
3 2 a 1 2
3 3 b 2 2
3 4 a 2 3
3 5 b 3 3
As can be seen, my attempt works as intended for ids 1 and 2, but not for 3 where it doesn't group the events into stages as required.
My query so far is:
select *,
Incorrect = row_number() over (partition by id, category order by ordinal)
from #Test
order by id, ordinal
Hope this is clear enough. Thanks for the help.
You can try this:
SELECT *
,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM @DataSource
ORDER BY [id], [ordinal];

Here is the full code:
DECLARE @DataSource TABLE
(
[id] TINYINT
,[ordinal] TINYINT
,[category] CHAR(1)
);
INSERT INTO @DataSource ([id], [ordinal], [category])
VALUES ('1', '1', 'a')
,('1', '2', 'b')
,('1', '3', 'a')
,('1', '4', 'b')
,('2', '1', 'a')
,('2', '2', 'b')
,('2', '3', 'a')
,('2', '4', 'b')
,('2', '5', 'a')
,('3', '1', 'b')
,('3', '2', 'a')
,('3', '3', 'b')
,('3', '4', 'a')
,('3', '5', 'b');
SELECT *
,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM @DataSource
ORDER BY [id], [ordinal];
Note, that, in cases, when the id group's first element is b then the final columns values starts from 0. If this is a issue you can try to add 1 in such cases like this:
SELECT DS.*
,SUM(IIF([category] = 'a', 1, 0)) OVER (PARTITION BY [id] ORDER BY [ordinal] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) + dsf.[fix]
FROM @DataSource DS
INNER JOIN
(
SELECT [id] as fix_id
,IIF(category = 'a', 0, 1) as fix
FROM @DataSource
WHERe [ordinal] = 1
) DSF
ON DS.id = DSF.[fix_id]
ORDER BY [id], [ordinal];

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With