I have a table with 2 fields (name, interest) and I want to find all pairs that have the same interest, with all duplicates and mirrored pairs removed.
I am able to find all pairs and remove duplicates with the following SQL statement:
SELECT P1.name AS name1, P2.name AS name2, P1.interest
FROM Table AS P1, Table AS P2
WHERE P1.interest = P2.interest AND P1.name <> P2.name;
But I am not sure how to remove mirrored pairs, e.g.:
"wil","ben","databases"
"ben","wil","databases"
I tried to make the above statement a view called Matches and attempted the following query:
SELECT * FROM Matches
WHERE name2 <> (select name1 from Matches);
But it does not remove all mirrored pairs.
Assuming you do not care which of the two mirrored pairs is kept, (ben, wil) or (wil, ben), my preferred solution is the following:
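-- DELETE ... FROM ... JOIN with an alias is accepted by SQL Server and MySQL;
-- other engines may need a different form.
DELETE p2
FROM Pairs p1
INNER JOIN Pairs p2
    ON  p1.Name1 = p2.Name2
    AND p1.Name2 = p2.Name1
    AND p1.Interest = p2.Interest
    -- match only one of the two pairs
    AND p1.Name1 > p1.Name2;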
Because Name1 and Name2 are never equal within a row, exactly one row of each mirrored pair has Name1 greater than Name2. The join matches that row as p1, so its mirror image p2 is the row that gets deleted.
This is especially trivial if you have a surrogate key for the relationship, as then the requirement for Name1 and Name2 to be unequal goes away.
Edit: if you don't want to remove them from the table, but just from the results of a specific query, use the same pattern with SELECT rather than DELETE.
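For illustration, here is a minimal sketch of that SELECT form, run against an in-memory SQLite database from Python. The sample rows are made up, and the sketch assumes a Pairs table in which every pair is stored together with its mirror, as in the question's Matches view; a pair without a mirror would be dropped by the inner join.

```python
import sqlite3

# Hypothetical sample data: each pair is present in both orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Pairs (Name1 TEXT, Name2 TEXT, Interest TEXT)")
conn.executemany(
    "INSERT INTO Pairs VALUES (?, ?, ?)",
    [
        ("wil", "ben", "databases"),
        ("ben", "wil", "databases"),
        ("amy", "zoe", "chess"),
        ("zoe", "amy", "chess"),
    ],
)

# Same join as the DELETE, but selecting the row that is kept (p1),
# i.e. the row of each mirrored pair where Name1 > Name2.
query = """
SELECT p1.Name1, p1.Name2, p1.Interest
FROM Pairs p1
INNER JOIN Pairs p2
    ON  p1.Name1 = p2.Name2
    AND p1.Name2 = p2.Name1
    AND p1.Interest = p2.Interest
    AND p1.Name1 > p1.Name2
ORDER BY p1.Name1
"""
unique_pairs = conn.execute(query).fetchall()
print(unique_pairs)
```

Each mirrored pair shows up once, in the orientation where the first name sorts after the second.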
I had a similar problem and, after studying the first answer, figured out that the query below does the trick:
SELECT P1.name AS name1,P2.name AS name2,P1.interest
FROM Table AS P1,Table AS P2
WHERE P1.interest=P2.interest AND P1.name>P2.name
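As a quick check of this query, here is a small sketch using Python's built-in sqlite3 module. The table name People and the sample rows are assumptions for the example (Table is a reserved word in most engines, so a different name is used here).

```python
import sqlite3

# Hypothetical sample data: three people share one interest, one does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (name TEXT, interest TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [
        ("wil", "databases"),
        ("ben", "databases"),
        ("amy", "databases"),
        ("zoe", "chess"),
    ],
)

# Replacing <> with > keeps only the orientation where name1 sorts after
# name2, so each unordered pair appears exactly once.
rows = conn.execute(
    """
    SELECT P1.name AS name1, P2.name AS name2, P1.interest
    FROM People AS P1, People AS P2
    WHERE P1.interest = P2.interest AND P1.name > P2.name
    ORDER BY name1, name2
    """
).fetchall()
print(rows)
```

Three people sharing an interest give three pairs instead of six, and zoe, who shares an interest with nobody, does not appear. If the base table can contain the same (name, interest) row more than once, SELECT DISTINCT would also be needed to drop repeated pairs.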