
How to merge consecutive date ranges in Oracle

I am facing a problem. I can't figure out how to merge rows with consecutive date ranges based on two dimensions. One dimension is fine, but the second is causing trouble.

Let's imagine a table with this structure, covering four possible scenarios:

  emp_id  |  level  |  date_from   |   date_to    
--------------------------------------------------
    1     |   A     |  7/31/2015   |  3/31/2016
    1     |   A     |  4/1/2016    |  1/1/3000

    2     |   A     |  7/31/2015   |  1/1/3000

    3     |   A     |  5/31/2015   |  12/31/2015
    3     |   B     |  1/1/2016    |  3/31/2016
    3     |   A     |  4/1/2016    |  6/30/2016
    3     |   B     |  7/1/2016    |  1/1/3000

    4     |   A     |  5/31/2015   |  12/31/2015
    4     |   A     |  1/1/2016    |  6/30/2016
    4     |   B     |  7/1/2016    |  1/1/3000

I want to merge only those rows that have consecutive date ranges and the same level as the previous row (act_level = prev_level).

I tried something like this:

SELECT emp_id
, level
, date_from
, date_to
--
, CASE
    WHEN lag(level) over (partition by emp_id order by date_from) = level THEN 
         CASE
             WHEN lag(date_to) over (partition by emp_id, level order by date_from) = date_from-1 
               THEN lag(date_from) over (partition by emp_id, level order by date_from)
             ELSE NULL
         END
    ELSE 
         CASE
             WHEN lag(level) over (partition by emp_id order by date_from) = level
                     OR
                  lead(level) over (partition by emp_id order by date_from) = level
                THEN NULL
             ELSE date_from
         END
  END date_from_new
, date_to as date_to_new
--
FROM src_table
--
WHERE 1=1

This gives me nearly the results I want:

  emp_id  |  level  |  date_from   |   date_to    |  d_from_new  |  d_to_new
-------------------------------------------------------------------------------
    1     |   A     |  7/31/2015   |  3/31/2016   |              |  3/31/2016
    1     |   A     |  4/1/2016    |  1/1/3000    |  7/31/2015   |  1/1/3000

    2     |   A     |  7/31/2015   |  1/1/3000    |  7/31/2015   |  1/1/3000

    3     |   A     |  5/31/2015   |  12/31/2015  |  5/31/2015   |  12/31/2015
    3     |   B     |  1/1/2016    |  3/31/2016   |  1/1/2016    |  3/31/2016
    3     |   A     |  4/1/2016    |  6/30/2016   |  4/1/2016    |  6/30/2016
    3     |   B     |  7/1/2016    |  1/1/3000    |  7/1/2016    |  1/1/3000

    4     |   A     |  5/31/2015   |  12/31/2015  |              |  12/31/2015
    4     |   A     |  1/1/2016    |  6/30/2016   |  5/31/2015   |  6/30/2016
    4     |   B     |  7/1/2016    |  1/1/3000    |  7/1/2016    |  1/1/3000

I will then just filter the result to rows where d_from_new (date_from_new) is not null. But I am not sure what will happen if the same level appears, for example, three times in a row with consecutive date ranges, or eight times.

And honestly - I don't like the query :)

Do you have any "performance-friendly" and "eye-friendly" solution?

Mr.P asked Oct 29 '22 19:10


1 Answer

Please try this query:

select emp_id, lvl, min(date_from) df, max(date_to) dt
  from (
    select s2.*, rn - sum(marker) over (order by rn) as grp
      from (
        select s1.*,
               row_number() over (order by emp_id, date_from) rn,
               case when lag(lvl) over (partition by emp_id order by date_from) 
                         = lvl
                     and lag(date_to) over (partition by emp_id order by date_from) + 1 
                         = date_from
                    then 1
                    else 0
               end marker
          from src_table s1 ) s2 )
  group by emp_id, lvl, grp
  order by emp_id, min(date_from)

In the innermost subquery (the one reading from S1) I added a MARKER column, which is 1 when the previous row for the same employee has the same level and the dates are consecutive, and 0 otherwise. The second subquery uses this marker to build the GRP column, which has the same value for all rows that belong together. The final query groups by this column to find the minimum date_from and the maximum date_to. Please run the inner queries separately to see what happens at each step. I also tested it with more than two consecutive rows.
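To make the steps visible, here is a sketch of just the inner two levels of the query, without the final grouping. It assumes the src_table test data listed further down in this answer. The RUNNING_SUM column and the to_char formatting are my additions for illustration only; they are not part of the original query.

```sql
-- Inner levels only: every source row stays visible, no GROUP BY yet.
select emp_id, lvl,
       to_char(date_from, 'YYYY-MM-DD') df,
       rn, marker,
       -- running_sum is shown only for illustration (not in the original query)
       sum(marker) over (order by rn) as running_sum,
       -- rows that continue the previous one keep the same grp value
       rn - sum(marker) over (order by rn) as grp
  from (
    select s1.*,
           row_number() over (order by emp_id, date_from) rn,
           case when lag(lvl) over (partition by emp_id order by date_from) = lvl
                 and lag(date_to) over (partition by emp_id order by date_from) + 1 = date_from
                then 1
                else 0
           end marker
      from src_table s1 )
  order by rn;

-- Expected on the ten test rows:
-- EMP_ID LVL DF          RN MARKER RUNNING_SUM GRP
--      1 A   2015-07-31   1      0           0   1
--      1 A   2016-04-01   2      1           1   1
--      2 A   2015-07-31   3      0           1   2
--      3 A   2015-05-31   4      0           1   3
--      3 B   2016-01-01   5      0           1   4
--      3 A   2016-04-01   6      0           1   5
--      3 B   2016-07-01   7      0           1   6
--      4 A   2015-05-31   8      0           1   7
--      4 A   2016-01-01   9      1           2   7
--      4 B   2016-07-01  10      0           2   8
```

Each row with MARKER = 1 increases the running sum by one, which cancels out the increase in RN, so GRP stays the same as on the previous row. That is why rows 1 and 2 share GRP 1 and rows 8 and 9 share GRP 7.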

Test data and output:

create table src_table (emp_id number(6), lvl varchar2(2), date_from date, date_to date);
insert into src_table values (1, 'A', date '2015-07-31', date '2016-03-31');
insert into src_table values (1, 'A', date '2016-04-01', date '3000-01-01');
insert into src_table values (2, 'A', date '2015-07-31', date '3000-01-01');
insert into src_table values (3, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (3, 'B', date '2016-01-01', date '2016-03-31');
insert into src_table values (3, 'A', date '2016-04-01', date '2016-06-30');
insert into src_table values (3, 'B', date '2016-07-01', date '3000-01-01');
insert into src_table values (4, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (4, 'A', date '2016-01-01', date '2016-06-30');
insert into src_table values (4, 'B', date '2016-07-01', date '3000-01-01');

 EMP_ID LVL DF          DT
------- --- ----------- -----------
      1 A   2015-07-31  3000-01-01
      2 A   2015-07-31  3000-01-01
      3 A   2015-05-31  2015-12-31
      3 B   2016-01-01  2016-03-31
      3 A   2016-04-01  2016-06-30
      3 B   2016-07-01  3000-01-01
      4 A   2015-05-31  2016-06-30
      4 B   2016-07-01  3000-01-01

8 rows selected
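
To check the case from the question where the same level repeats more than twice, the test table above can be extended with a made-up employee. This is only a sketch: emp_id 5 and its dates are hypothetical and not part of the original data, and the WHERE clause is added just to keep the output short.

```sql
-- Hypothetical extra employee: three back-to-back ranges on the same level.
insert into src_table values (5, 'A', date '2015-01-01', date '2015-12-31');
insert into src_table values (5, 'A', date '2016-01-01', date '2016-12-31');
insert into src_table values (5, 'A', date '2017-01-01', date '3000-01-01');

-- Same query as above, restricted to the new employee.
select emp_id, lvl,
       to_char(min(date_from), 'YYYY-MM-DD') df,
       to_char(max(date_to), 'YYYY-MM-DD') dt
  from (
    select s2.*, rn - sum(marker) over (order by rn) as grp
      from (
        select s1.*,
               row_number() over (order by emp_id, date_from) rn,
               case when lag(lvl) over (partition by emp_id order by date_from) = lvl
                     and lag(date_to) over (partition by emp_id order by date_from) + 1 = date_from
                    then 1
                    else 0
               end marker
          from src_table s1
         where emp_id = 5 ) s2 )
  group by emp_id, lvl, grp;

-- Expected: one merged row
-- EMP_ID LVL DF          DT
--      5 A   2015-01-01  3000-01-01
```

The markers for these three rows are 0, 1, 1, so all three get the same GRP and collapse into one range. The same reasoning applies to longer chains, as long as each range starts exactly one day after the previous one ends.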
Ponder Stibbons answered Nov 15 '22 06:11