(PostgreSQL 8.4) Table "trackingMessages" stores tracking events between mobile devices (tm_nl_mobileid) and fixed devices (tm_nl_fixedId).
CREATE TABLE trackingMessages
(
tm_id SERIAL PRIMARY KEY, -- PK
tm_nl_mobileId INTEGER, -- FK to mobile
tm_nl_fixedId INTEGER, -- FK to fixed
tm_date INTEGER, -- Network time
tm_messageType INTEGER, -- 0=disconnect, 1=connect
CONSTRAINT tm_unique_row
UNIQUE (tm_nl_mobileId, tm_nl_fixedId, tm_date, tm_messageType)
);
Problem here is that it's possible that the same mobile will connect to the same fixed twice (or more times) subsequently. I don't want to see the subsequent ones, but it's OK to see a mobile connected to a same fixed at a later date, provided there was a connection to a different fixed in between.
I think I'm close but not quite. I've been using the following CTE (found here on Stack Overflow)
WITH cte AS
(
SELECT tm_nl_fixedid, tm_date, Row_number() OVER (
partition BY tm_nl_fixedid
ORDER BY tm_date ASC
) RN
FROM trackingMessages
)
SELECT * FROM cte
WHERE tm_nl_mobileid = 150 AND tm_messagetype = 1
ORDER BY tm_date;
Gives me the following results
32;1316538756;1
21;1316539069;1
32;1316539194;2
32;1316539221;3
21;1316539235;2
The problem here is that the last column should be 1, 1, 1, 2, 1, because that third "32" is in fact a duplicate tracking event (twice in a row at the same fixed) and that last connection to "21" is OK because "32" was in between.
Please don't suggest a cursor, this is what I am currently trying to move away from. The cursor solution does work, but it's too slow given the amount of records I have to deal with. I'd much rather fix the CTE and only select where RN = 1
... unless you have a better idea!
Well, you're not that close because row_number()
cannot track sequences by two groups at the same time. PARTITION BY tm_nl_fixedid ORDER BY date RESTART ON GAP
does not exist, there's no such thing.
Itzik Ben-Gan has a solution for the islands and gaps problem you are facing (several solutions, actually). The idea is to order rows by the main criteria (date) and then by partitioning criteria + main criteria. Difference between ordinals will remain the same as they belong to the same partitioning criteria and date series.
with cte as
(
select *,
-- While order by date and order by something-else, date
-- run along, they belong to the same sequence
row_number() over (order by tm_date)
- row_number() over (order by tm_nl_fixedid, tm_date) grp
from trackingMessages
)
select *,
-- Now we can get ordinal number grouped by each sequence
row_number() over (partition by tm_nl_fixedid, grp
order by tm_date) rn
from cte
order by tm_date
Here is Sql Fiddle with example.
And here is chapter 5 of Sql Server MVP Deep Dives with several solutions to islands and gaps problem.
This should be simpler with the window function lag()
:
WITH cte AS (
SELECT *
,lag(tm_nl_fixedId) OVER (PARTITION BY tm_nl_mobileId
ORDER BY tm_date) AS last_fixed
FROM trackingmessages
)
SELECT *
FROM cte
WHERE last_fixed IS DISTINCT FROM tm_nl_fixedId
ORDER BY tm_date
In the CTE, lag()
gets the last fixed device to which a mobile connected (NULL
for the first row per mobile - that's why I use IS DISTINCT FROM
later, see a different approach here).
Then simply exclude all rows where the last fixed device was the same as this one, thereby excluding all "subsequent ones". All done.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With