Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Oracle timeline report from overlapping intervals

I am using Oracle 10g, and I have a table with overlapping intervals. In a simplified form:

create TABLE INTERVAL_TEST (
    STARTDATE DATE NOT NULL,
    ENDDATE DATE,
    ITEM VARCHAR2(100) NOT NULL
)

insert into INTERVAL_TEST (STARTDATE, ENDDATE, ITEM) values (to_date('2012-01-01', 'YYYY-MM-DD'), null, 'AAA');
insert into INTERVAL_TEST (STARTDATE, ENDDATE, ITEM) values (to_date('2012-02-01', 'YYYY-MM-DD'), to_date('2012-03-01', 'YYYY-MM-DD'), 'BBB');

As you see, there are overlapping entries. The null in the first one means that this is valid "forever". I need to generate a report that shows the history on a timeline, listing the number of valid items for each line. There should be no time-overlap in the output, instead all the valid items should be listed for each subinterval where there was no change:

START       END          VALID_ITEMS
----------  ----------   --------------
2012-01-01  2012-02-01   AAA
2012-02-01  2012-03-01   AAA, BBB
2012-03-01  NULL         AAA

Is there a SQL statement that generates such a result?

like image 712
lbalazscs Avatar asked Oct 07 '22 07:10

lbalazscs


1 Answers

You can do it in one whooping statement:

SQL> WITH timeline AS
  2          (SELECT mydate startdate,
  3                  lead(mydate) OVER (ORDER BY mydate) - 1 enddate
  4             FROM (SELECT startdate mydate FROM interval_test
  5                   UNION
  6                   SELECT enddate FROM interval_test)
  7            WHERE mydate IS NOT NULL)
  8  SELECT startdate,
  9         enddate,
 10         max(substr(sys_connect_by_path(item, ','), 2)) items
 11    FROM (SELECT t.startdate,
 12                 t.enddate,
 13                 item,
 14                 row_number() OVER (PARTITION BY t.startdate, t.enddate
 15                                    ORDER BY i.item) rn
 16            FROM    timeline t
 17                 JOIN
 18                    interval_test i
 19                 ON nvl(i.enddate, DATE '9999-12-31') - 1 >= t.startdate
 20                AND i.startdate <= nvl(t.enddate, DATE '9999-12-31'))
 21  START WITH rn = 1
 22  CONNECT BY rn = PRIOR rn + 1
 23         AND startdate = PRIOR startdate
 24  GROUP BY startdate, enddate
 25  ORDER BY startdate;

STARTDATE  ENDDATE    ITEMS
---------- ---------- --------------------
2012-01-01 2012-01-31 AAA
2012-02-01 2012-02-29 AAA,BBB
2012-03-01            AAA

I used a first subquery to list all intervals:

SQL> SELECT mydate startdate,
  2                  lead(mydate) OVER (ORDER BY mydate) - 1 enddate
  3             FROM (SELECT startdate mydate FROM interval_test
  4                   UNION
  5                   SELECT enddate FROM interval_test)
  6            WHERE mydate IS NOT NULL;

STARTDATE  ENDDATE
---------- ----------
2012-01-01 2012-01-31
2012-02-01 2012-02-29
2012-03-01

joined to the following query that lists all items on one row given two dates:

SELECT max(substr(sys_connect_by_path(item, ','), 2)) items
  FROM (SELECT item, row_number() OVER (ORDER BY item) rn
          FROM interval_test
         WHERE nvl(enddate, DATE '9999-12-31') >= :startdate
           AND startdate <= :enddate)
CONNECT BY rn = PRIOR rn + 1
START WITH rn = 1;
like image 176
Vincent Malgrat Avatar answered Oct 13 '22 12:10

Vincent Malgrat