I can't figure out how to merge consecutive date-range rows based on two dimensions. I can handle one dimension, but the second is causing trouble.
Imagine a table with this structure, covering four possible scenarios:
emp_id | level | date_from | date_to
--------------------------------------------------
1 | A | 7/31/2015 | 3/31/2016
1 | A | 4/1/2016 | 1/1/3000
2 | A | 7/31/2015 | 1/1/3000
3 | A | 5/31/2015 | 12/31/2015
3 | B | 1/1/2016 | 3/31/2016
3 | A | 4/1/2016 | 6/30/2016
3 | B | 7/1/2016 | 1/1/3000
4 | A | 5/31/2015 | 12/31/2015
4 | A | 1/1/2016 | 6/30/2016
4 | B | 7/1/2016 | 1/1/3000
I want to merge only those rows that have consecutive date ranges and the same level as the previous row (current level = previous level).
I tried something like this:
SELECT emp_id
     , level
     , date_from
     , date_to
     --
     , CASE
         WHEN lag(level) over (partition by emp_id order by date_from) = level THEN
           CASE
             WHEN lag(date_to) over (partition by emp_id, level order by date_from) = date_from-1
             THEN lag(date_from) over (partition by emp_id, level order by date_from)
             ELSE NULL
           END
         ELSE
           CASE
             WHEN lag(level) over (partition by emp_id order by date_from) = level
                  OR
                  lead(level) over (partition by emp_id order by date_from) = level
             THEN NULL
             ELSE date_from
           END
       END date_from_new
     , date_to as date_to_new
     --
FROM src_table
--
WHERE 1=1
This gives me nearly the result I want:
emp_id | level | date_from | date_to | d_from_new | d_to_new
--------------------------------------------------------------------------
1 | A | 7/31/2015 | 3/31/2016 | | 3/31/2016
1 | A | 4/1/2016 | 1/1/3000 | 7/31/2015 | 1/1/3000
2 | A | 7/31/2015 | 1/1/3000 | 7/31/2015 | 1/1/3000
3 | A | 5/31/2015 | 12/31/2015 | 5/31/2015 | 12/31/2015
3 | B | 1/1/2016 | 3/31/2016 | 1/1/2016 | 3/31/2016
3 | A | 4/1/2016 | 6/30/2016 | 4/1/2016 | 6/30/2016
3 | B | 7/1/2016 | 1/1/3000 | 7/1/2016 | 1/1/3000
4 | A | 5/31/2015 | 12/31/2015 | | 12/31/2015
4 | A | 1/1/2016 | 6/30/2016 | 5/31/2015 | 6/30/2016
4 | B | 7/1/2016 | 1/1/3000 | 7/1/2016 | 1/1/3000
I would then just keep the rows where d_from_new (date_from_new) is not null. But I am not sure what will happen if the same level appears with consecutive date ranges three times in a row, or eight times.
And honestly, I don't like the query :)
Is there a more performance-friendly and eye-friendly solution?
Please try this query:
select emp_id, lvl, min(date_from) df, max(date_to) dt
from (
    select s2.*, rn - sum(marker) over (order by rn) as grp
    from (
        select s1.*,
               row_number() over (order by emp_id, date_from) rn,
               case when lag(lvl) over (partition by emp_id order by date_from)
                         = lvl
                     and lag(date_to) over (partition by emp_id order by date_from) + 1
                         = date_from
                    then 1
                    else 0
               end marker
        from src_table s1 ) s2 )
group by emp_id, lvl, grp
order by emp_id, min(date_from)
In the innermost subquery (the one reading src_table as s1) I add a marker column. It is 1 when the previous row for the same employee has the same level and its date_to is exactly one day before the current date_from, and 0 otherwise. The second subquery uses this marker to build the grp column (rn minus the running sum of marker), which has the same value for all rows that belong together. The final query groups by this column to find the minimum date_from and the maximum date_to. Run the inner queries separately to see what happens at each step. I also tested this with more than two consecutive rows.
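To make the intermediate step easier to follow, the middle query can be run on its own against the test table from this answer. This is only a sketch for inspection: it is the same logic as the query above with the final grouping removed, and the values in the trailing comment are what I would expect when working through the ten sample rows by hand, not captured output.

```sql
-- Inspect rn, marker and grp for each source row (Oracle syntax).
select s2.emp_id, s2.lvl, s2.date_from, s2.date_to, s2.rn, s2.marker,
       s2.rn - sum(s2.marker) over (order by s2.rn) as grp
from (
    select s1.*,
           -- global row number, ordered by employee and start date
           row_number() over (order by emp_id, date_from) rn,
           -- 1 = this row continues the previous row of the same employee
           case when lag(lvl) over (partition by emp_id order by date_from) = lvl
                 and lag(date_to) over (partition by emp_id order by date_from) + 1 = date_from
                then 1
                else 0
           end marker
    from src_table s1 ) s2
order by s2.rn;

-- Expected for the sample data, worked out by hand:
-- rn      1  2  3  4  5  6  7  8  9  10
-- marker  0  1  0  0  0  0  0  0  1  0
-- grp     1  1  2  3  4  5  6  7  7  8
```

Each row that continues its predecessor increases the running sum by one at the same moment rn increases by one, so the difference stays constant within a run and changes as soon as a new run starts.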
Test data and output (the level column is named lvl here, because LEVEL is a reserved word in Oracle):
create table src_table (emp_id number(6), lvl varchar2(2), date_from date, date_to date);
insert into src_table values (1, 'A', date '2015-07-31', date '2016-03-31');
insert into src_table values (1, 'A', date '2016-04-01', date '3000-01-01');
insert into src_table values (2, 'A', date '2015-07-31', date '3000-01-01');
insert into src_table values (3, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (3, 'B', date '2016-01-01', date '2016-03-31');
insert into src_table values (3, 'A', date '2016-04-01', date '2016-06-30');
insert into src_table values (3, 'B', date '2016-07-01', date '3000-01-01');
insert into src_table values (4, 'A', date '2015-05-31', date '2015-12-31');
insert into src_table values (4, 'A', date '2016-01-01', date '2016-06-30');
insert into src_table values (4, 'B', date '2016-07-01', date '3000-01-01');
EMP_ID LVL DF DT
------- --- ----------- -----------
1 A 2015-07-31 3000-01-01
2 A 2015-07-31 3000-01-01
3 A 2015-05-31 2015-12-31
3 B 2016-01-01 2016-03-31
3 A 2016-04-01 2016-06-30
3 B 2016-07-01 3000-01-01
4 A 2015-05-31 2016-06-30
4 B 2016-07-01 3000-01-01
8 rows selected
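The question also asks what happens when the same level appears in three or more consecutive ranges. One way to check is to add a made-up employee and rerun the merge query. The emp_id 5 rows below are hypothetical and not part of the original data; the dates were chosen so that each range starts the day after the previous one ends.

```sql
-- Hypothetical extra employee: three adjacent ranges at the same level.
insert into src_table values (5, 'A', date '2015-01-01', date '2015-06-30');
insert into src_table values (5, 'A', date '2015-07-01', date '2015-12-31');
insert into src_table values (5, 'A', date '2016-01-01', date '3000-01-01');

-- Same merge query as above, with the output limited to the new employee.
select emp_id, lvl, min(date_from) df, max(date_to) dt
from (
    select s2.*, rn - sum(marker) over (order by rn) as grp
    from (
        select s1.*,
               row_number() over (order by emp_id, date_from) rn,
               case when lag(lvl) over (partition by emp_id order by date_from) = lvl
                     and lag(date_to) over (partition by emp_id order by date_from) + 1 = date_from
                    then 1
                    else 0
               end marker
        from src_table s1 ) s2 )
where emp_id = 5
group by emp_id, lvl, grp
order by emp_id, min(date_from);
```

The second and third rows both get marker = 1, so all three rows share one grp value and should collapse into a single row running from 2015-01-01 to 3000-01-01. The same reasoning applies to a run of any length.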