I have some data that represents different 'actions'. These 'actions' collectively comprise an 'event'.
The data looks like this:
EventID | UserID | Action | TimeStamp
--------------+------------+------------+-------------------------
1 | 111 | Start | 2012-01-01 08:00:00
1 | 111 | Stop | 2012-01-01 08:59:59
1 | 999 | Start | 2012-01-01 09:00:00
1 | 999 | Stop | 2012-01-01 09:59:59
1 | 111 | Start | 2012-01-01 10:00:00
1 | 111 | Stop | 2012-01-01 10:30:00
As you can see, each single 'event' is made of one or more 'Actions' (or as I think of them, 'sub events').
I need to identify each 'sub event' and give it an identifier. This is what I am looking for:
EventID | SubeventID | UserID | Action | TimeStamp
--------------+----------------+------------+------------+-------------------------
1 | 1 | 111 | Start | 2012-01-01 08:00:00
1 | 1 | 111 | Stop | 2012-01-01 08:59:59
1 | 2 | 999 | Start | 2012-01-01 09:00:00
1 | 2 | 999 | Stop | 2012-01-01 09:59:59
1 | 3 | 111 | Start | 2012-01-01 10:00:00
1 | 3 | 111 | Stop | 2012-01-01 10:30:00
I need something that can start counting, but only increment when some column has a specific value (like "Action" = 'Start').
I have been trying to use Window Functions for this, but with limited success. I just can't seem to find a solution that I feel will work... any thoughts?
If you have some field you can sort by, you could use the following query (untested):
SELECT
sum(("Action" = 'Start')::int) OVER (PARTITION BY "EventID" ORDER BY "Timestamp" ROWS UNBOUNDED PRECEDING)
FROM
events
Note that if the first SubEvent does not start with Start, it will have an event id of 0, which might not be what you want.
You could also use COUNT()
in place of SUM()
:
SELECT
EventID
, COUNT(CASE WHEN Action = 'Start' THEN 1 END)
OVER ( PARTITION BY EventID
ORDER BY TimeStamp
ROWS UNBOUNDED PRECEDING )
AS SubeventID
, UserID
, Action
FROM
tableX AS t ;
Tests at SQL-Fiddle: test
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With