Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL select items that make datetime range between flag toggle

Say I have a table like this one:

CREATE TABLE TESTTABLE (
  ID Integer NOT NULL,
  ATMOMENT Timestamp NOT NULL,
  ISALARM Integer NOT NULL,
  CONSTRAINT PK_TESTTABLE PRIMARY KEY (ID)
);

It has ISALARM flag that toggles between 0 and 1 at random moments ATMOMENT, like in this example dataset:

INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('1', '01.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('2', '01.01.2016, 00:01:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('3', '01.01.2016, 00:02:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('4', '01.01.2016, 00:02:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('10', '02.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('11', '02.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('12', '02.01.2016, 00:01:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('20', '03.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('21', '03.01.2016, 00:01:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('22', '03.01.2016, 00:02:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('23', '03.01.2016, 00:02:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('30', '04.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('31', '04.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('32', '04.01.2016, 00:00:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('33', '04.01.2016, 00:00:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('40', '05.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('41', '05.01.2016, 00:00:00.000', '1');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('42', '05.01.2016, 00:00:00.000', '0');
INSERT INTO TESTTABLE (ID, ATMOMENT, ISALARM) VALUES ('43', '05.01.2016, 00:00:00.000', '0');

I need to select all alarm ranges, i.e. the ATMOMENT ranges where ISALARM is set to 1 (first time after previous range is closed) at range begin and reset back to 0 at range end. Say for clarity first reset is enough to close such range; say also that the simultaneous ISALARM set and reset are treated like the range end (while possibly as the begin). Example dataset above is expected to produce something like this:

 ALARMBEGIN                |  LASTALARMBEGIN            |  ALARMEND
-------------------------- | -------------------------- | --------
'01.01.2016, 00:00:00.000' | '01.01.2016, 00:01:00.000' | '01.01.2016, 00:02:00.000'
'02.01.2016, 00:00:00.000' | '02.01.2016, 00:00:00.000' | '02.01.2016, 00:01:00.000'
'03.01.2016, 00:00:00.000' | '03.01.2016, 00:02:00.000' | '03.01.2016, 00:02:00.000'
'04.01.2016, 00:00:00.000' | '04.01.2016, 00:00:00.000' | '04.01.2016, 00:00:00.000'
'05.01.2016, 00:00:00.000' | '05.01.2016, 00:00:00.000' | '05.01.2016, 00:00:00.000'

My own solution to this (below) looks pretty ugly and runs stunningly slow (about 1minute) even if the TESTTABLE has relatively small dataset with only ~2500 records (tested it with Firebird2.5 and Postgresql; I'm not good with DB optimization; "CREATE INDEX IDX_TESTTABLE1 ON TESTTABLE (ATMOMENT,ISALARM)" helps but not very much).

It is pretty strange for me because simple linear iteration on all TESTTABLE records (ordered by ATMOMENT) while comparing ISALARM field to one of the previous record gives me the ranges I want much faster.

Are there any elegant solution to make SQL select this faster and in cleaner way?

SELECT DISTINCT a1.ATMOMENT AS ALARMBEGIN, a2.ATMOMENT AS LASTALARMBEGIN, a3.ATMOMENT AS ALARMEND
FROM TESTTABLE a1
JOIN TESTTABLE a2 ON 
    (a1.ATMOMENT<a2.ATMOMENT
        AND NOT EXISTS(SELECT * FROM TESTTABLE x WHERE 
            x.ISALARM=0 AND a1.ATMOMENT<=x.ATMOMENT AND x.ATMOMENT<a2.ATMOMENT))
    OR (a1.ATMOMENT=a2.ATMOMENT)
JOIN TESTTABLE a3 ON 
    (a2.ATMOMENT<a3.ATMOMENT
        AND NOT EXISTS(SELECT * FROM TESTTABLE x WHERE 
            (x.ISALARM=0 AND a2.ATMOMENT<=x.ATMOMENT AND x.ATMOMENT<a3.ATMOMENT)
            OR (x.ISALARM=1 AND a2.ATMOMENT<x.ATMOMENT AND x.ATMOMENT<=a3.ATMOMENT)))
    OR (a2.ATMOMENT=a3.ATMOMENT)
WHERE a1.ISALARM<>0 AND a2.ISALARM<>0 AND a3.ISALARM=0
    AND (NOT EXISTS(SELECT * FROM TESTTABLE x1 WHERE
            x1.ATMOMENT<a1.ATMOMENT)
        OR EXISTS(SELECT * FROM TESTTABLE x1 WHERE
            x1.ISALARM=0
            AND x1.ATMOMENT<a1.ATMOMENT
            AND NOT EXISTS(SELECT * FROM TESTTABLE x2 WHERE
                x1.ATMOMENT<x2.ATMOMENT AND x2.ATMOMENT<a1.ATMOMENT)))
ORDER BY a1.ATMOMENT

Thank you.

Upd 1

Thanks to Gordon Linoff's and Jayvee's solutions (which are very good with Firebird3.0 and PostgreSQL) I've decided to rely on ordering efficiency of Firebird2.5 and contrived the "select" which is even uglier than my previous one but runs significantly faster. For those who need it done with Firebird2.5:

WITH 
GROUPEDTABLE_TT (ATMOMENT, NOTISALARMRESET, ISALARMSET)
AS(
SELECT a.ATMOMENT, MIN(a.ISALARM), MAX(a.ISALARM)
FROM TESTTABLE a
GROUP BY a.ATMOMENT),

INTERVALBEGIN_TT 
AS(
SELECT a1.ATMOMENT
FROM GROUPEDTABLE_TT a1
WHERE 
    a1.ISALARMSET<>0
    AND (NOT EXISTS (SELECT * FROM GROUPEDTABLE_TT x WHERE
            x.ATMOMENT<a1.ATMOMENT)
        OR (SELECT FIRST 1 x.NOTISALARMRESET FROM GROUPEDTABLE_TT x WHERE
            x.ATMOMENT<a1.ATMOMENT
            ORDER BY x.ATMOMENT DESC)=0)),

INTERVALLAST_TT 
AS(
SELECT a2.ATMOMENT FROM GROUPEDTABLE_TT a2
WHERE a2.ISALARMSET=1
    AND (a2.NOTISALARMRESET=0
        OR (a2.NOTISALARMRESET=1 
            AND (SELECT FIRST 1 x.NOTISALARMRESET FROM GROUPEDTABLE_TT x WHERE
                x.ATMOMENT>a2.ATMOMENT
                ORDER BY x.ATMOMENT ASC)=0
            AND (SELECT FIRST 1 x.ISALARMSET FROM GROUPEDTABLE_TT x WHERE
                x.ATMOMENT>a2.ATMOMENT
                ORDER BY x.ATMOMENT ASC)=0))),

INTERVALEND_TT 
AS(
SELECT a1.ATMOMENT
FROM GROUPEDTABLE_TT a1
WHERE 
    a1.NOTISALARMRESET=0
    AND (a1.ISALARMSET=1 
        OR (a1.ISALARMSET=0 
            AND (SELECT FIRST 1 x.ISALARMSET FROM GROUPEDTABLE_TT x WHERE
                x.ATMOMENT<a1.ATMOMENT
                ORDER BY x.ATMOMENT DESC)=1
            AND (SELECT FIRST 1 x.NOTISALARMRESET FROM GROUPEDTABLE_TT x WHERE
                x.ATMOMENT<a1.ATMOMENT
                ORDER BY x.ATMOMENT DESC)=1))),

ENCLOSEDINTERVALS_TT (BEGINMOMENT, LASTBEGINMOMENT, ENDMOMENT)
AS(
SELECT ib.ATMOMENT, 
    (SELECT FIRST 1 il.ATMOMENT FROM INTERVALLAST_TT il WHERE 
        ib.ATMOMENT<=il.ATMOMENT ORDER BY il.ATMOMENT ASC),
    (SELECT FIRST 1 ie.ATMOMENT FROM INTERVALEND_TT ie WHERE 
        ib.ATMOMENT<=ie.ATMOMENT ORDER BY ie.ATMOMENT ASC)
FROM INTERVALBEGIN_TT ib)

SELECT * FROM ENCLOSEDINTERVALS_TT
ORDER BY BEGINMOMENT

Upd 2 ...but my selects seems to show quadratic growth (or at least faster then linear) of the fetch number depending of the total record number; it's better to use procedure with single-pass linear iteration for FB2.5. Or to use FB30 with solutions below...

like image 340
N.Cherney Avatar asked Oct 17 '16 10:10

N.Cherney


1 Answers

This has been tested in PostgreSQL, the idea is create 3 ordered common tables for beginnings, last beginnings and ends respectively and then join the 3 tables.

It can be done with less code by creating only one CTE and flagging the rows with a case statement and then a selfjoin, which you can do later but in this way the code is more self explanatory and should be fairly efficient too.

;
with beginnings
as
(
    select atmoment, row_number() over(order by atmoment) rn from
    (
        select *, lag(atmoment,1) over(order by atmoment,isalarm desc) prevtime, 
          lag(isalarm,1) over(order by atmoment,isalarm desc) prevstatus
        from testtable 
    ) t
    where coalesce(prevstatus,0)=0 and isalarm=1
),
ends 
as
(
    select atmoment, row_number() over(order by atmoment) rn from
    (
        select *, lead(atmoment,1) over(order by atmoment,isalarm) nexttime, 
          lead(isalarm,1) over(order by atmoment,isalarm) nextstatus
        from testtable 
    ) t
    where coalesce(nextstatus,1)=1 and isalarm=0
),
lastbeginnings
as
(
    select atmoment, row_number() over(order by atmoment) rn from
    (
        select *, lead(atmoment,1) over(order by atmoment,isalarm desc) nexttime, 
          lead(isalarm,1) over(order by atmoment,isalarm desc) nextstatus
        from testtable 
    ) t
    where coalesce(nextstatus,0)=0 and isalarm=1
)

select b.atmoment ALARMBEGIN, lb.atmoment LASTALARMBEGIN, e.atmoment  ALARMEND 
from beginnings b
join lastbeginnings lb on lb.rn=b.rn
join ends e on e.rn=b.rn

result:

> 2016-01-01 00:00:00 | 2016-01-01 00:01:00 | 2016-01-01 00:02:00
> 2016-01-02 00:00:00 | 2016-01-02 00:00:00 | 2016-01-02 00:01:00
> 2016-01-03 00:00:00 | 2016-01-03 00:02:00 | 2016-01-03 00:02:00
> 2016-01-04 00:00:00 | 2016-01-04 00:00:00 | 2016-01-04 00:00:00
> 2016-01-05 00:00:00 | 2016-01-05 00:00:00 | 2016-01-05 00:00:00
like image 196
Jayvee Avatar answered Oct 27 '22 23:10

Jayvee