Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Incrementally comparing multiple value sets or lists between two tables in Oracle

Tags:

sql

oracle11g

I am trying to compare two sets of values between 2 Oracle tables as below. I am trying to look for and match groups of data in table B with those in table A. The group number is common between tables

Its considered a match only if all groups and values under an id in Table A are equal the group and value pair in Table B. I have highlighted the 'matches' in green. Table A could have variable number of group/value pairs under ida value. There could be ids that have only one group/value pair and there could be some that have 3 group/value pairs

Comparison Example

  Ida GroupA Vala|GroupB Valb| Match?
  ------------------------------------------------------------------------
  50    1     4  |  1     1  | No - Value doesn't match
  56    1     5  |  1     1  | No - Value doesn't match
  57    1     1  |  1     1  | Yes - Both Groups (1&2) and Values match
  57    2    101 |  2    101 | Yes - Both Group (1&2)and Values match
  94    1     1  |  1     1  | Yes - Group and Value match
  96    1     1  |  1     1  | No - Only group 1 matches
  96    2    102 |  2    101 | No - Only group 1 matches. Group 2 doesn't

Trial (and Error!)

I figured I would have to use some sort of count and tried using a partition by to count the groups in Table A. But, I am not sure how to use this in a query to do a sequential/multi value comparison. I looked up hierarchical functions but realized they may not fit here.. What would be the best approach to deal with such data comparison? Thanks for your help..

Happy Halloween! :)

select a.*,MAX(a.groupa) OVER (PARTITION BY a.ida ORDER BY a.groupa desc) 
occurs 
from tab_a a, tab_b b
where a.groupa=b.groupb and a.vala=b.valb
and a.groupa<=3

Tables

Tables A and B

create table tab_a   
(                    
ida number,         
groupa number, 
vala number
)

create table tab_b
(   
idb number, 
groupb number, 
valb number
)

Data

insert into tab_a values (50,1,4);
insert into tab_a values (56,1,5);
insert into tab_a values (57,1,1);
insert into tab_a values (57,2,101);
insert into tab_a values (58,1,1);
insert into tab_a values (58,2,104);
insert into tab_a values (60,2,102);
insert into tab_a values (94,1,1);
insert into tab_a values (95,1,1);
insert into tab_a values (95,2,101);
insert into tab_a values (96,1,1);
insert into tab_a values (96,2,102);
insert into tab_a values (97,1,1);
insert into tab_a values (97,2,101);
insert into tab_a values (97,3,201);

insert into tab_b values (752,1,1);
insert into tab_b values (752,2,101);
insert into tab_b values (752,3,201);
like image 725
Sofia Avatar asked Mar 10 '26 01:03

Sofia


2 Answers

I don't think this is all the way there but might get you started. You can do:

select a.*, b.*,
  count(case when a.groupa = b.groupb and a.vala = b.valb then a.ida end)
    over (partition by a.ida) match_count,
  count(distinct a.groupa||':'||a.vala)
    over (partition by a.ida) val_count
from tab_a a
full outer join tab_b b on b.groupb = a.groupa and b.valb = a.vala
where a.groupa <= 3;

The distinct may not be needed, and the concatenation with the colon needs to use a characters that isn't in any real value, I suppose, to avoid potential for false matched.

That gets:

IDA GROUPA VALA  IDB GROUPB VALB MATCH_COUNT  VAL_COUNT
--- ------ ---- ---- ------ ---- ----------- ----------
 50      1    4                            0          1
 56      1    5                            0          1
 57      1    1  752      1    1           2          2
 57      2  101  752      2  101           2          2
 58      1    1  752      1    1           1          2
 58      2  104                            1          2
 60      2  102                            0          1
 94      1    1  752      1    1           1          1
 95      1    1  752      1    1           2          2
 95      2  101  752      2  101           2          2
 96      1    1  752      1    1           1          2
 96      2  102                            1          2
 97      1    1  752      1    1           3          3
 97      2  101  752      2  101           3          3
 97      3  201  752      3  201           3          3

And then use that as a CTE or inline view and decode the results:

with t as (
  select a.ida, a.groupa, a.vala, b.groupb, b.valb,
    count(case when a.groupa = b.groupb and a.vala = b.valb then a.ida end)
      over (partition by a.ida) match_count,
    count(distinct a.groupa||':'||a.vala)
      over (partition by a.ida) val_count
  from tab_a a
  full outer join tab_b b on b.groupb = a.groupa and b.valb = a.vala
  where a.groupa <= 3
)
select ida, groupa, vala, groupb, valb,
  case
    when match_count = 0 then 'No - Value doesn''t match'
    when match_count = val_count and val_count = 1
      then 'Yes - Group and Value match'
    when match_count = val_count and val_count = 2
      then 'Yes - Both Group (1&2) and Values match'
    when match_count < val_count and val_count = 2 and valb is not null
      then 'No - Only group 1 matches'
    when match_count < val_count and val_count = 2 and valb is null
      then 'No - Only group 1 matches. Group 2 doesn''t'
    else 'Unknown scenario?'
  end as "Match?"
from t;

Which gets:

IDA GROUPA VALA GROUPB VALB Match?                                    
--- ------ ---- ------ ---- ------------------------------------------
 50      1    4             No - Value doesn't match                  
 56      1    5             No - Value doesn't match                  
 57      1    1      1    1 Yes - Both Group (1&2) and Values match   
 57      2  101      2  101 Yes - Both Group (1&2) and Values match   
 58      1    1      1    1 No - Only group 1 matches                 
 58      2  104             No - Only group 1 matches. Group 2 doesn't
 60      2  102             No - Value doesn't match                  
 94      1    1      1    1 Yes - Group and Value match               
 95      1    1      1    1 Yes - Both Group (1&2) and Values match   
 95      2  101      2  101 Yes - Both Group (1&2) and Values match   
 96      1    1      1    1 No - Only group 1 matches                 
 96      2  102             No - Only group 1 matches. Group 2 doesn't
 97      1    1      1    1 Yes - All Group (1&2&3) and Values match  
 97      2  101      2  101 Yes - All Group (1&2&3) and Values match  
 97      3  201      3  201 Yes - All Group (1&2&3) and Values match  

I think that gets the match result you showed in your examples; not sure if the others you didn't show are what you want... ID 97 matches on three groups/values, and it's easy enough to do:

    when match_count = val_count and val_count = 3
      then 'Yes - All Group (1&2&3) and Values match'

for that exact match, but figuring out what to show if one or two of those three match is trickier. You could also capture the min and max B values that do match and work out from those which one(s) are missing; but then you might add a fourth group, and it doesn't scale.

like image 147
Alex Poole Avatar answered Mar 12 '26 15:03

Alex Poole


This query should work:

select a.ida
from tab_a a
where a.groupa||a.vala in 
  (select b.groupb|| b.valb from tab_b b where b.groupb = a.groupa )
group by a.ida
having count(distinct a.groupa||a.vala) = 
  (select count(distinct a1.groupa||a1.vala) 
    from tab_a a1 
    where a1.ida = a.ida)

Bit of explanation:

1. where clause gets all the rows from tab_a 
     that exist in tab_b for a group+val combo. 
     - So let's say there are 2 (out of 2) rows in tab_a 
       that match with 2(out of 3) rows in tab_b.

2. left hand side of the having clause adds 
      a condition to the found rows such that 
      total number of rows of distinct group+val must equal to
      - So here we start comparing that count 2

3. right hand side of the having clause 
      that provides the total number of 
      distinct group+val (regardless of any match with tab_b).
      - here we enforce that left hand side must be equal
        to the total number of rows found. So if in #2 above, 
        only 1 row of table_a matched (out of its 2 rows), 
        then #3 will exclude that set.
like image 40
donlys Avatar answered Mar 12 '26 14:03

donlys



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!