
Remove reverse duplicates from an SQL query

Tags: sql

Let's say that a query result is supposed to return a list of string pairs (x, y). I am trying to eliminate the reverse duplicates. What I mean is, if (x, y) was one of the results, (y, x) should not appear later.

example:

column 1 (foreign key)    column 2 (int)     column 3 (string)
4                         50                 Bob
2                         70                 Steve 
3                         50                 Joe

The people represented in this table can appear multiple times with a different column 2 value.

My query needs to print every pair of names that have the same column 2 value:

select e.column3, f.column3 from example as e, example as f where e.column2 = f.column2 

(Bob, Bob)
(Bob, Joe)
(Joe, Bob)
(Joe, Joe)

I then extended the query so that it removes the self-pairs such as (Bob, Bob):

select e.column3, f.column3 from example as e, example as f where e.column2 = f.column2
       and e.column3 <> f.column3

(Bob, Joe)
(Joe, Bob)

Now I want it to only return:

(Bob, Joe). 

(Joe, Bob) is a reverse duplicate, so I don't want it in the result. Is there any way to handle that in one query?

Gregory-Turtle asked Oct 24 '12 01:10




2 Answers

First of all, welcome to 2012. We have migrated away from relating tables using commas. That syntax was introduced in ANSI 89 but is severely lacking. Nowadays, the correct way is to write queries using the ANSI 92/99/2003 JOIN syntax.

The solution to your problem is to turn your bidirectional inequality <> into a unidirectional inequality, either < or > whichever you prefer.

select e.column3, f.column3
from example as e
join example as f on e.column2 = f.column2 and e.column3 < f.column3
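
To check this against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The column types and the in-memory database are assumptions made for illustration; the query itself is the one above, with an order by added only to make the output deterministic.

```python
import sqlite3

# In-memory database; the column types are assumed from the question's table header.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table example (column1 integer, column2 integer, column3 text)"
)
conn.executemany(
    "insert into example values (?, ?, ?)",
    [(4, 50, "Bob"), (2, 70, "Steve"), (3, 50, "Joe")],
)

# The strict < keeps exactly one orientation of each pair and also
# drops self-pairs such as (Bob, Bob), because a name is never < itself.
pairs = conn.execute(
    """
    select e.column3, f.column3
    from example as e
    join example as f on e.column2 = f.column2 and e.column3 < f.column3
    order by e.column3, f.column3
    """
).fetchall()

print(pairs)  # [('Bob', 'Joe')]
```

Because the comparison is on the names themselves, the pair that survives is the one whose first name sorts lower under the database's string collation.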

RichardTheKiwi answered Oct 05 '22 01:10


select e.column3, f.column3 from example as e, example as f where e.column2 = f.column2
       and e.column3 <> f.column3 and e.id < f.id

Adding one more condition to the where clause should do it (this assumes the table has a unique id column).
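
A rough way to try this out with Python's built-in sqlite3 module is sketched below. Note that the id column is hypothetical: the sample table in the question does not show one, so it is added here purely so the comparison on id has something to work with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is a hypothetical unique key; the question's sample table does not show one.
conn.execute(
    "create table example ("
    "id integer primary key, column1 integer, column2 integer, column3 text)"
)
conn.executemany(
    "insert into example (id, column1, column2, column3) values (?, ?, ?, ?)",
    [(1, 4, 50, "Bob"), (2, 2, 70, "Steve"), (3, 3, 50, "Joe")],
)

# Every condition lives in a single where clause, joined with "and".
id_pairs = conn.execute(
    """
    select e.column3, f.column3
    from example as e, example as f
    where e.column2 = f.column2
      and e.column3 <> f.column3
      and e.id < f.id
    """
).fetchall()

print(id_pairs)  # [('Bob', 'Joe')]
```

With this variant the orientation that survives follows the id order rather than the alphabetical order of the names, so (Joe, Bob) would be returned instead if Joe's row had the lower id.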

xQbert answered Oct 05 '22 03:10