I have data like below. In this I need to find rows where the diff in their months should be >= 6. The logic should be, we need to compare from first row until we got the row where diff in months is 6, then need to follow the same logic from the newly matched row and it goes on.
I am adding below my sample data and expected output.
Fromdate LectureiD StudentID Diff Months Expected
1-Oct-13 1102 55586 null
15-Oct-13 1102 55586 0
15-Oct-13 1102 55586 0
4-Apr-14 1102 55586 6
19-Dec-14 1102 55586 8
27-Dec-14 1102 55586 0
14-Jan-15 1102 55586 0
14-Jan-15 1102 55586 0
29-Sep-15 1102 55586 8
1-Oct-13 1102 55557 null
15-Oct-13 1102 55557 0
15-Oct-13 1102 55557 0
4-Apr-14 1102 55557 6
19-Dec-14 1102 55557 8
Below is the logic I tried by using analytical function in oracle.
select lectureid,
studentid,
floor(months_between(fromdate,
lag(fromdate) over (partition by
lectureid,
studentid
order by fromdate
)
)
) monthdiff
from above_table;
Since the lag function will have the default offset of 1 it just checks the row only prior to that because of that the logic I mentioned above is not working properly here since that needs to checked against their prior rows dynamically based on the newly matched row. So the o/p am getting here is as below.
The rows highlighted with asterisk is getting wrong because of this since it compares only with its immediate prior row.
Fromdate LectureiD StudentID Diff Month
1-Oct-13 1102 55586 null
15-Oct-13 1102 55586 0
15-Oct-13 1102 55586 0
*4-Apr-14 1102 55586 5*
19-Dec-14 1102 55586 8
27-Dec-14 1102 55586 0
14-Jan-15 1102 55586 0
14-Jan-15 1102 55586 0
29-Sep-15 1102 55586 8
1-Oct-13 1102 55557 null
15-Oct-13 1102 55557 0
15-Oct-13 1102 55557 0
*4-Apr-14 1102 55557 5*
19-Dec-14 1102 55557 8
Any help here would be greatly appreciated!!!
Recursive solution:
with tmp as (
select fromdate fd, lectureid lid, studentid sid, null mb,
row_number() over (partition by lectureid, studentid order by fromdate) rn
from above_table ),
cte (fd, ld, lid, sid, mb, rn) as (
select fd, fd, lid, sid, mb, rn from tmp where rn=1
union all
select tmp.fd, case when floor(months_between(tmp.fd, cte.ld)) >= 6
then tmp.fd else cte.ld end,
tmp.lid, tmp.sid, floor(months_between(tmp.fd, cte.ld)), tmp.rn
from tmp join cte on tmp.lid = cte.lid and tmp.sid = cte.sid and tmp.rn = cte.rn+1)
select to_char(fd, 'yyyy-mm-dd') fromdate, lid lecture, sid student, mb
from cte order by sid desc, fd
Test data and output:
create table above_table (Fromdate date, LectureiD number(6), StudentID number(6), Diff number(4));
insert into above_table values (date '2013-10-01', 1102, 55586, null);
insert into above_table values (date '2013-10-15', 1102, 55586, 0);
insert into above_table values (date '2013-10-15', 1102, 55586, 0);
insert into above_table values (date '2014-04-04', 1102, 55586, 6);
insert into above_table values (date '2014-12-19', 1102, 55586, 8);
insert into above_table values (date '2014-12-27', 1102, 55586, 0);
insert into above_table values (date '2015-01-14', 1102, 55586, 0);
insert into above_table values (date '2015-01-14', 1102, 55586, 0);
insert into above_table values (date '2015-09-29', 1102, 55586, 8);
insert into above_table values (date '2013-10-01', 1102, 55557, null);
insert into above_table values (date '2013-10-15', 1102, 55557, 0);
insert into above_table values (date '2013-10-15', 1102, 55557, 0);
insert into above_table values (date '2013-10-29', 1102, 55557, 0);
insert into above_table values (date '2014-04-04', 1102, 55557, 6);
insert into above_table values (date '2014-12-19', 1102, 55557, 8);
FROMDATE LECTURE STUDENT MB
---------- ---------- ---------- ----------
2013-10-01 1102 55586
2013-10-15 1102 55586 0
2013-10-15 1102 55586 0
2014-04-04 1102 55586 6
2014-12-19 1102 55586 8
2014-12-27 1102 55586 0
2015-01-14 1102 55586 0
2015-01-14 1102 55586 0
2015-09-29 1102 55586 9
2013-10-01 1102 55557
2013-10-15 1102 55557 0
2013-10-15 1102 55557 0
2013-10-29 1102 55557 0
2014-04-04 1102 55557 6
2014-12-19 1102 55557 8
Explanation:
Subquery tmp only enumerates rows separately for each lecture and students:
select fromdate fd, lectureid lid, studentid sid, null mb, row_number() over (partition by lectureid, studentid order by fromdate) rn
This line is the "anchor" of recursive subquery CTE, where we take two rows with numbered as 1 in first step
select fd, fd, lid, sid, mb, rn from tmp where rn=1
In this step I am attaching "recursive member" using condition tmp.rn = cte.rn+1 Specially important is part below, here I am checking if there are six months between last remembered date and date from current row:
case when floor(months_between(tmp.fd, cte.ld)) >= 6 then tmp.fd else cte.ld end
Last select is required part of syntax.
Some useful links:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With