I am trying to compare two sets of values between 2 Oracle tables as below. I am trying to look for and match groups of data in table B with those in table A. The group number is common between tables
Its considered a match only if all groups and values under an id in Table A are equal the group and value pair in Table B. I have highlighted the 'matches' in green. Table A could have variable number of group/value pairs under ida value. There could be ids that have only one group/value pair and there could be some that have 3 group/value pairs
Ida GroupA Vala|GroupB Valb| Match?
------------------------------------------------------------------------
50 1 4 | 1 1 | No - Value doesn't match
56 1 5 | 1 1 | No - Value doesn't match
57 1 1 | 1 1 | Yes - Both Groups (1&2) and Values match
57 2 101 | 2 101 | Yes - Both Group (1&2)and Values match
94 1 1 | 1 1 | Yes - Group and Value match
96 1 1 | 1 1 | No - Only group 1 matches
96 2 102 | 2 101 | No - Only group 1 matches. Group 2 doesn't
I figured I would have to use some sort of count and tried using a partition by to count the groups in Table A. But, I am not sure how to use this in a query to do a sequential/multi value comparison. I looked up hierarchical functions but realized they may not fit here.. What would be the best approach to deal with such data comparison? Thanks for your help..
Happy Halloween! :)
select a.*,MAX(a.groupa) OVER (PARTITION BY a.ida ORDER BY a.groupa desc)
occurs
from tab_a a, tab_b b
where a.groupa=b.groupb and a.vala=b.valb
and a.groupa<=3
Tables A and B
create table tab_a
(
ida number,
groupa number,
vala number
)
create table tab_b
(
idb number,
groupb number,
valb number
)
insert into tab_a values (50,1,4);
insert into tab_a values (56,1,5);
insert into tab_a values (57,1,1);
insert into tab_a values (57,2,101);
insert into tab_a values (58,1,1);
insert into tab_a values (58,2,104);
insert into tab_a values (60,2,102);
insert into tab_a values (94,1,1);
insert into tab_a values (95,1,1);
insert into tab_a values (95,2,101);
insert into tab_a values (96,1,1);
insert into tab_a values (96,2,102);
insert into tab_a values (97,1,1);
insert into tab_a values (97,2,101);
insert into tab_a values (97,3,201);
insert into tab_b values (752,1,1);
insert into tab_b values (752,2,101);
insert into tab_b values (752,3,201);
I don't think this is all the way there but might get you started. You can do:
select a.*, b.*,
count(case when a.groupa = b.groupb and a.vala = b.valb then a.ida end)
over (partition by a.ida) match_count,
count(distinct a.groupa||':'||a.vala)
over (partition by a.ida) val_count
from tab_a a
full outer join tab_b b on b.groupb = a.groupa and b.valb = a.vala
where a.groupa <= 3;
The distinct may not be needed, and the concatenation with the colon needs to use a characters that isn't in any real value, I suppose, to avoid potential for false matched.
That gets:
IDA GROUPA VALA IDB GROUPB VALB MATCH_COUNT VAL_COUNT
--- ------ ---- ---- ------ ---- ----------- ----------
50 1 4 0 1
56 1 5 0 1
57 1 1 752 1 1 2 2
57 2 101 752 2 101 2 2
58 1 1 752 1 1 1 2
58 2 104 1 2
60 2 102 0 1
94 1 1 752 1 1 1 1
95 1 1 752 1 1 2 2
95 2 101 752 2 101 2 2
96 1 1 752 1 1 1 2
96 2 102 1 2
97 1 1 752 1 1 3 3
97 2 101 752 2 101 3 3
97 3 201 752 3 201 3 3
And then use that as a CTE or inline view and decode the results:
with t as (
select a.ida, a.groupa, a.vala, b.groupb, b.valb,
count(case when a.groupa = b.groupb and a.vala = b.valb then a.ida end)
over (partition by a.ida) match_count,
count(distinct a.groupa||':'||a.vala)
over (partition by a.ida) val_count
from tab_a a
full outer join tab_b b on b.groupb = a.groupa and b.valb = a.vala
where a.groupa <= 3
)
select ida, groupa, vala, groupb, valb,
case
when match_count = 0 then 'No - Value doesn''t match'
when match_count = val_count and val_count = 1
then 'Yes - Group and Value match'
when match_count = val_count and val_count = 2
then 'Yes - Both Group (1&2) and Values match'
when match_count < val_count and val_count = 2 and valb is not null
then 'No - Only group 1 matches'
when match_count < val_count and val_count = 2 and valb is null
then 'No - Only group 1 matches. Group 2 doesn''t'
else 'Unknown scenario?'
end as "Match?"
from t;
Which gets:
IDA GROUPA VALA GROUPB VALB Match?
--- ------ ---- ------ ---- ------------------------------------------
50 1 4 No - Value doesn't match
56 1 5 No - Value doesn't match
57 1 1 1 1 Yes - Both Group (1&2) and Values match
57 2 101 2 101 Yes - Both Group (1&2) and Values match
58 1 1 1 1 No - Only group 1 matches
58 2 104 No - Only group 1 matches. Group 2 doesn't
60 2 102 No - Value doesn't match
94 1 1 1 1 Yes - Group and Value match
95 1 1 1 1 Yes - Both Group (1&2) and Values match
95 2 101 2 101 Yes - Both Group (1&2) and Values match
96 1 1 1 1 No - Only group 1 matches
96 2 102 No - Only group 1 matches. Group 2 doesn't
97 1 1 1 1 Yes - All Group (1&2&3) and Values match
97 2 101 2 101 Yes - All Group (1&2&3) and Values match
97 3 201 3 201 Yes - All Group (1&2&3) and Values match
I think that gets the match result you showed in your examples; not sure if the others you didn't show are what you want... ID 97 matches on three groups/values, and it's easy enough to do:
when match_count = val_count and val_count = 3
then 'Yes - All Group (1&2&3) and Values match'
for that exact match, but figuring out what to show if one or two of those three match is trickier. You could also capture the min and max B values that do match and work out from those which one(s) are missing; but then you might add a fourth group, and it doesn't scale.
This query should work:
select a.ida
from tab_a a
where a.groupa||a.vala in
(select b.groupb|| b.valb from tab_b b where b.groupb = a.groupa )
group by a.ida
having count(distinct a.groupa||a.vala) =
(select count(distinct a1.groupa||a1.vala)
from tab_a a1
where a1.ida = a.ida)
Bit of explanation:
1. where clause gets all the rows from tab_a that exist in tab_b for a group+val combo. - So let's say there are 2 (out of 2) rows in tab_a that match with 2(out of 3) rows in tab_b. 2. left hand side of the having clause adds a condition to the found rows such that total number of rows of distinct group+val must equal to - So here we start comparing that count 2 3. right hand side of the having clause that provides the total number of distinct group+val (regardless of any match with tab_b). - here we enforce that left hand side must be equal to the total number of rows found. So if in #2 above, only 1 row of table_a matched (out of its 2 rows), then #3 will exclude that set.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With