I need to join table A and table B to create table C.
Table A and Table B store status flags for the IDs. The status flags (A_Flag and B_Flag) can change from time to time, so one ID can contain multiple rows, which represents the history of the ID's statuses. The flags for a particular ID can change independently of each other, which can result in one row in Table A belonging to multiple rows in Table B, and vice versa.
The resulting table (Table C) needs to be a list of unique date ranges covering every date within the IDs life (01/01/2008-18/08/2008), and A_Flag and B_Flag values for each date range.
The actual tables contain hundreds of IDs with each ID having a varying numbers of rows per table.
I have access to SQL and SAS tools to achieve the end result.
Source - Table A
ID Start End A_Flag
1 01/01/2008 23/03/2008 1
1 23/03/2008 15/06/2008 0
1 15/06/2008 18/08/2008 1
Source - Table B
ID Start End B_Flag
1 19/01/2008 17/02/2008 1
1 17/02/2008 15/06/2008 0
1 15/06/2008 18/08/2008 1
Result - Table C
ID Start End A_Flag B_Flag
1 01/01/2008 19/01/2008 1 0
1 19/01/2008 17/02/2008 1 1
1 17/02/2008 23/03/2008 1 0
1 23/03/2008 15/06/2008 0 0
1 15/06/2008 18/08/2008 1 1
Subtract the cumulative days from the startdate to get our grp . This is the crux of the solution. If the start date increases at the same rate as the days spanned then the days are consecutive, and subtracting the two will give us the same value.
You can do this by swapping the ranges if necessary up front. Then, you can detect overlap if the second range start is: less than or equal to the first range end (if ranges are inclusive, containing both the start and end times); or. less than (if ranges are inclusive of start and exclusive of end).
Overlapping date a day and month in any year during the deposit period, whose number is the same as the number of the day and month on which the deposit commencement date falls.
I'm going to solve this in SQL, assuming that you have a function called lag
(SQL Server 2012, Oracle, Postgres, DB2). You can get the same effect with a correlated subquery.
The idea is to get all the different time periods. Then join back to the original tables to get the flags.
I am having trouble uploading the code, but can get most of it. However, it starts with start ends, which you create by doing a union
(not union all
) of the four dates in one column: select a.start as thedate. This is then union'ed with a.end, b.start, and b.end.
with driver as (
select thedate as start, lag(thedate) over (order by thedate) as end
from startends
)
select startdate, enddate, a.flag, b.flag
from driver left outer join
a
on a.start >= driver.start and a.end <= driver.end left outer join
b
on b.start >= driver.start and b.end <= driver.end
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With