Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL find identical group

Given a table like:

id    key   val
----  ----  -----
bob   hair  red
bob   eyes  green

And another table like:

id    key   val
----  ----  -----
fred  hair  red
fred  eyes  green
fred  shoe  42
joe   hair  red 
joe   eyes  green
greg  eyes  blue
greg  hair  brown

I'd like to find people in table b who match people in table a exactly, in this case Bob and Joe. Fred doesn't count because he also has a shoe size. This is in Sybase so there's no full outer join. I've come up with a select of a select with a union that returns people who definitely aren't the same, but I'm not sure how to efficiently select people who are.

Alternatively, if it's simpler, how can I check which groups in a occur in b more than once?

like image 946
chpatrick Avatar asked Aug 01 '11 09:08

chpatrick


People also ask

What is identical in SQL?

Duplicate records in SQL, also known as duplicate rows, are identical rows in an SQL table. This means, for a pair of duplicate records, the values in each column coincide. Usually, we will retrieve duplicate data, when we are joining tables.

Does GROUP BY have duplicates?

GROUP BY only treats two rows as duplicates if all the column values in both the rows are the same. If even a single column value in either of the row is non-matching, they are treated as unique.

How do I find duplicate records in SQL Server without GROUP BY?

1. Using the Distinct Keyword to eliminate duplicate values and count their occurences from the Query results. We can use the Distinct keyword to fetch the unique records from our database. This way we can view the unique results from our database.


1 Answers

Try this

select a.id,b.id
from a 
join b on a.[key] = b.[key] and a.val = b.val -- match all rows
join (select id,count(*) total from a group by id) a2 on a.id = a2.id -- get the total keys for table a per id
join (select id,count(*) total from b group by id) b2 on b.id = b2.id -- get the total keys for table b per id
group by a.id,b.id,a2.total,b2.total
having count(*) = a2.total AND count(*) = b2.total -- the matching row's total should be equal with each tables keys per id

After @t-clausen.dk comments I made a revision of the original sql code. In this case i count each distinct pair/value that matches on both tables, with each tables distinct pair/value.

select td.aid,td.bid
from (
select a.id as aid,b.id as bid, count(distinct a.[key]+' '+a.val) total
from a 
join b on a.[kry] = b.[key] and a.val = b.val
group by a.id,b.id
) td -- match all distinct attribute rows
join (select id,count(distinct [key]+' '+val) total from a group by id) a2 on td.aid = a2.id -- get the total distinct keys for table a per id
join (select id,count(distinct [key]+' '+val) total from b group by id) b2 on td.bid = b2.id -- get the total keys for table b per id
where td.total = a2.total AND td.total = b2.total -- the matching distinct attribute total should be equal with each tables distinct key-val pair

Tested on

Table a

bob     hair    red
bob     eyes    green
nick    hair    red
nick    eyes    green
nick    shoe    45

Table b

fred    hair    red
fred    eyes    green
joe     hair    red
joe     eyes    green
fred    shoe    42
like image 118
niktrs Avatar answered Oct 12 '22 07:10

niktrs