Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simplify the timeline in SQL(Netezza)

Tags:

sql

mysql

netezza

I want to summarize/simplify(I don't know what to call) my timeline.

So what I have are the ID with the timeline. I am trying to get rid of the overlap timeline within the same id.

Here are the example of the data. What I have:

ID   START_TIME   END_TIME
 1        a          b
 1        c          d
 1        e          f
 1        g          h

enter image description here
As you can see from the picture, [a,b],[c,d],[e,f] overlap each other and [g,h] are disjoint so I just want [a,f] and [g,h]. What I want:

ID   START_TIME   END_TIME
 1        a          f
 1        g          h
like image 743
devon Avatar asked Mar 14 '26 20:03

devon


1 Answers

I think @shA.T is close. The problem is in multiple overlaps this would break down. You might have to turn this into a multiple step process

Step 1 (Make Sample Table):

 create temp table stack (
 id integer
 ,start_time timestamp
 ,end_time timestamp
 )

 insert into stack  values(1, date('2020-01-01'),date('2020-01-01') + interval '3 hours');
 insert into stack  values(1,date('2020-01-01') + interval '2 hours',date('2020-01-01') + interval '4 hours');
 insert into stack  values(1,date('2020-01-01') + interval '3.5 hours',date('2020-01-01') + interval '5 hours');
 insert into stack  values(1,date('2020-01-01') + interval '5.5 hours',date('2020-01-01') + interval '6.5 hours');
 insert into stack  values(1,date('2020-01-01') + interval '7.5 hours',date('2020-01-01') + interval '9.5 hours');
 insert into stack  values(1,date('2020-01-01') + interval '8.5 hours',date('2020-01-01') + interval '10.5 hours');

Step 2 (Find single overlaps):

create temp table stack2 as
 SELECT ID, ps2 as start_time, max(e) AS End_Time
    FROM (
        SELECT t1.ID, t1.START_TIME AS s, MAX(t1.END_TIME) AS e,
               max(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
               ,CASE WHEN pe between s and e THEN ps ELSE s END ps2
        FROM stack AS t1
        JOIN stack AS t2 ON t1.START_TIME > t2.START_TIME
        GROUP BY t1.ID, t1.START_TIME) AS DT
    GROUP BY
        ID, ps2
    ORDER BY ps2

Step 3 (Consolidate Double Overlaps):

 SELECT ID, ps2 as start_time, max(e) AS End_Time
    FROM (
        SELECT t1.ID, t1.START_TIME AS s, MAX(t1.END_TIME) AS e,
               max(t2.START_TIME) As ps, MAX(t2.END_TIME) AS pe
               ,CASE WHEN pe between s and e THEN ps ELSE s END ps2
        FROM stack2 AS t1
        JOIN stack2 AS t2 ON t1.START_TIME > t2.START_TIME
        GROUP BY t1.ID, t1.START_TIME) AS DT
    GROUP BY
        ID, ps2
    ORDER BY ps2
like image 182
Niederee Avatar answered Mar 16 '26 11:03

Niederee