Delete where one column contains duplicates

Question

Consider the following:

ProductID  Supplier
---------  --------
111        Microsoft
112        Microsoft
222        Apple Mac
222        Apple
223        Apple

In this example, product 222 is repeated because the supplier appears under two different names in the supplied data.

I have data like this for thousands of products. How can I delete the duplicate products, or select a single row per product? Perhaps something like a self join with SELECT TOP 1?

Thanks!

Gordon Linoff · Accepted Answer

I think you want to do the following:

select t.*
from (select t.*,
             row_number() over (partition by ProductID order by (select NULL)) as seqnum
      from t
     ) t
where seqnum = 1

This selects an arbitrary row for each product.
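If the surviving row should be predictable rather than arbitrary, the window can be ordered by a real column. The following is a minimal sketch on the question's sample rows, using Python's built-in sqlite3 module as a stand-in for SQL Server. The in-memory database, the table name t, and the choice of ORDER BY Supplier in place of (select NULL) are assumptions made for illustration, and SQLite 3.25 or later is needed for ROW_NUMBER().

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ProductID INTEGER, Supplier TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(111, "Microsoft"), (112, "Microsoft"),
     (222, "Apple Mac"), (222, "Apple"), (223, "Apple")],
)

# Number the rows within each ProductID, alphabetically by Supplier,
# and keep only the first row of each group.
rows = conn.execute(
    """
    SELECT ProductID, Supplier
    FROM (SELECT ProductID, Supplier,
                 ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Supplier) AS seqnum
          FROM t) AS numbered
    WHERE seqnum = 1
    ORDER BY ProductID
    """
).fetchall()

for product_id, supplier in rows:
    print(product_id, supplier)
```

With this ordering, product 222 keeps the "Apple" row because it sorts before "Apple Mac".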

To delete all rows but one, you can use the same idea:

with todelete as (
      select t.*,
             row_number() over (partition by ProductID order by (select NULL)) as seqnum
      from t
     )
delete from todelete where seqnum > 1
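Before running a delete like this, it can help to list which products are actually duplicated. Here is a minimal sketch of such a check, using Python's built-in sqlite3 module as a stand-in for SQL Server and assuming the table is named t. The GROUP BY ... HAVING query itself is plain SQL and should be usable on SQL Server as written.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ProductID INTEGER, Supplier TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(111, "Microsoft"), (112, "Microsoft"),
     (222, "Apple Mac"), (222, "Apple"), (223, "Apple")],
)

# List every ProductID that occurs more than once, with its row count.
duplicates = conn.execute(
    """
    SELECT ProductID, COUNT(*) AS copies
    FROM t
    GROUP BY ProductID
    HAVING COUNT(*) > 1
    ORDER BY ProductID
    """
).fetchall()

for product_id, copies in duplicates:
    print(product_id, copies)
```

Running the same check after the delete should return no rows.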

John Woo · Answer

DELETE  a
FROM    tableName a
        LEFT JOIN
        (
            SELECT  Supplier, MIN(ProductID) min_ID
            FROM    tableName
            GROUP   BY Supplier
        ) b ON  a.supplier = b.supplier AND
                a.ProductID = b.min_ID
WHERE   b.Supplier IS NULL
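Note that this statement keeps one row per Supplier (the lowest ProductID for each supplier name), not one row per ProductID. The sketch below illustrates this on the question's sample rows, using Python's built-in sqlite3 module as a stand-in for SQL Server. Instead of deleting, it selects the rows that the LEFT JOIN would match, which are the rows the DELETE leaves in place. The SELECT is a rewrite for illustration, not the answer's own statement.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (ProductID INTEGER, Supplier TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [(111, "Microsoft"), (112, "Microsoft"),
     (222, "Apple Mac"), (222, "Apple"), (223, "Apple")],
)

# Rows that match (Supplier, lowest ProductID for that Supplier) are the
# ones the DELETE above would NOT remove.
kept = conn.execute(
    """
    SELECT a.ProductID, a.Supplier
    FROM tableName a
    JOIN (SELECT Supplier, MIN(ProductID) AS min_ID
          FROM tableName
          GROUP BY Supplier) b
      ON a.Supplier = b.Supplier AND a.ProductID = b.min_ID
    ORDER BY a.ProductID, a.Supplier
    """
).fetchall()

for product_id, supplier in kept:
    print(product_id, supplier)
```

On the sample data, product 222 still appears twice afterwards (once as "Apple", once as "Apple Mac"), while products 112 and 223 are removed.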

Or, if you want to delete the extra rows for any ProductID that appears more than once:

WITH cte 
AS
(
    SELECT  ProductID, Supplier,
            ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Supplier) rn
    FROM    tableName
)
DELETE FROM cte WHERE rn > 1
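T-SQL lets you delete through the CTE, but SQLite does not. For comparison, here is a hedged sketch of the same keep-the-first-row idea run through Python's built-in sqlite3 module. It identifies the surplus rows by SQLite's implicit rowid, which is a swapped-in technique for this sketch and not part of the answer. SQLite 3.25 or later is assumed for ROW_NUMBER().

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (ProductID INTEGER, Supplier TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [(111, "Microsoft"), (112, "Microsoft"),
     (222, "Apple Mac"), (222, "Apple"), (223, "Apple")],
)

# Rank rows within each ProductID by Supplier, then delete every row
# ranked after the first, matching rows by their implicit rowid.
conn.execute(
    """
    DELETE FROM tableName
    WHERE rowid IN (
        SELECT rid
        FROM (SELECT rowid AS rid,
                     ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Supplier) AS rn
              FROM tableName) AS ranked
        WHERE rn > 1
    )
    """
)

remaining = conn.execute(
    "SELECT ProductID, Supplier FROM tableName ORDER BY ProductID"
).fetchall()

for product_id, supplier in remaining:
    print(product_id, supplier)
```

Because the rows are ranked by Supplier, the "Apple" row for product 222 is kept and the "Apple Mac" row is deleted.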

Tags: sql, sql-server, tsql, sql-delete

Warren
