Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sql join using recursive cte

Edit: Added another case scenario in the notes and updated the sample attachment.

I am trying to write a sql to get an output attached with this question along with sample data. There are two table, one with distinct ID's (pk) with their current flag. another with Active ID (fk to the pk from the first table) and Inactive ID (fk to the pk from the first table) Final output should return two columns, first column consist of all distinct ID's from the first table and second column should contain Active ID from the 2nd table. Below is the sql:

IF OBJECT_ID('tempdb..#main') IS NOT NULL DROP TABLE #main;
IF OBJECT_ID('tempdb..#merges') IS NOT NULL DROP TABLE #merges
IF OBJECT_ID('tempdb..#final') IS NOT NULL DROP TABLE #final

SELECT DISTINCT id, 
       current
INTO   #main 
FROM   tb_ID t1 


--get list of all active_id and inactive_id 

SELECT DISTINCT active_id, 
            inactive_id, 
            Update_dt  
INTO   #merges 
FROM   tb_merges 
-- Combine where the id from the main table matched to the inactive_id (should return all the rows from #main)

   SELECT id, 
   active_id AS merged_to_id 
   INTO   #final 
   FROM   (SELECT t1.*, 
           t2.active_id, 
           Update_dt , 
           Row_number() 
             OVER ( 
               partition BY id, active_id 
               ORDER BY Update_dt DESC) AS rn 
    FROM   #main t1 
           LEFT JOIN #merges t2 
                  ON t1.id = t2.inactive_id) t3 
  WHERE  rn = 1 

 SELECT *
 FROM #final

This sql partially works. It doesn't work, where the id was once active then gets inactive. Please note:

  • the active ID should return the last most active ID
  • the ID which doesn't have any active ID should either be null or the ID itself
  • ID where the current = 0, in those cases active ID should be the ID current in tb_ID

  • ID's may get interchanged. For example there are two ID's 6 and 7, when 6 is active 7 is inactive and vice versa. the only way to know the most current active state is by the update date

Attached sample might be easy to understand

Sample data with expected output

Looks like I might have to use recursive cte for achieiving the results. Can someone please help? thank you for your time!

like image 963
Excited_to_learn Avatar asked Nov 08 '22 06:11

Excited_to_learn


1 Answers

I think you're correct that a recursive CTE looks like a good solution for this. I'm not entirely certain that I've understood exactly what you're asking for, particularly with regard to the update_dt column, just because the data is a little abstract as-is, but I've taken a stab at it, and it does seem to work with your sample data. The comments explain what's going on.

declare @tb_id table (id bigint, [current] bit);
declare @tb_merges table (active_id bigint, inactive_id bigint, update_dt datetime2);
insert @tb_id values
    -- Sample data from the question.
    (1, 1),
    (2, 1),
    (3, 1),
    (4, 1),
    (5, 0),
    -- A few additional data to illustrate a deeper search.
    (6, 1),
    (7, 1),
    (8, 1),
    (9, 1),
    (10, 1);

insert @tb_merges values
    -- Sample data from the question.
    (3, 1, '2017-01-11T13:09:00'),
    (1, 2, '2017-01-11T13:07:00'),
    (5, 4, '2013-12-31T14:37:00'),
    (4, 5, '2013-01-18T15:43:00'),
    -- A few additional data to illustrate a deeper search.
    (6, 7, getdate()),
    (7, 8, getdate()),
    (8, 9, getdate()),
    (9, 10, getdate());

if object_id('tempdb..#ValidMerge') is not null
    drop table #ValidMerge;

-- Get the subset of merge records whose active_id identifies a "current" id and
-- rank by date so we can consider only the latest merge record for each active_id.
with ValidMergeCTE as
(
    select
        M.active_id,
        M.inactive_id,
        [Priority] = row_number() over (partition by M.active_id order by M.update_dt desc)
    from 
        @tb_merges M
        inner join @tb_id I on M.active_id = I.id 
    where
        I.[current] = 1
)
select
    active_id,
    inactive_id
into
    #ValidMerge
from
    ValidMergeCTE
where
    [Priority] = 1;

 -- Here's the recursive CTE, which draws on the subset of merges identified above.
 with SearchCTE as
 (
    -- Base case: any record whose active_id is not used as an inactive_id is an endpoint.
    select
        M.active_id,
        M.inactive_id,
        Depth = 0
    from
        #ValidMerge M
    where
        not exists (select 1 from #ValidMerge M2 where M.active_id = M2.inactive_id)

    -- Recursive case: look for records whose active_id matches the inactive_id of a previously
    --                 identified record.
    union all
    select
        S.active_id,
        M.inactive_id,
        Depth = S.Depth + 1
    from
        #ValidMerge M
        inner join SearchCTE S on M.active_id = S.inactive_id
 )
 select
    I.id,
    S.active_id
 from
    @tb_id I
    left join SearchCTE S on I.id = S.inactive_id;

Results:

id      active_id
------------------
1       3
2       3
3       NULL
4       NULL
5       4
6       NULL
7       6
8       6
9       6
10      6
like image 196
Joe Farrell Avatar answered Nov 15 '22 07:11

Joe Farrell