Delete where one column contains duplicates

Consider the data below:

ProductID  Supplier
---------  --------
111        Microsoft
112        Microsoft
222        Apple Mac
222        Apple
223        Apple

In this example, product 222 is repeated because the supplier appears under two different names in the data supplied.

I have data like this for thousands of products. How can I delete the duplicate products, or select a single row per product, perhaps with a self join and SELECT TOP 1?

Thanks!

Warren asked Jan 15 '23 05:01


2 Answers

I think you want to do the following:

select t.*
from (select t.*,
             row_number() over (partition by ProductID order by (select NULL)) as seqnum
      from t
     ) t
where seqnum = 1

This selects an arbitrary row for each product.
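
If it matters which row survives, the `order by (select NULL)` can be swapped for a real ordering. A minimal sketch, assuming the table is called t and has the ProductID and Supplier columns from the question; the "shortest supplier name wins" rule is only a hypothetical tie-breaker, so pick whatever rule suits your data:

```sql
-- Keep one row per ProductID, preferring the shortest Supplier name.
-- Assumes a table t(ProductID, Supplier) as in the question.
-- The ordering rule (shortest name, then alphabetical) is illustrative only.
SELECT ProductID, Supplier
FROM (
    SELECT t.ProductID,
           t.Supplier,
           ROW_NUMBER() OVER (
               PARTITION BY t.ProductID
               ORDER BY LEN(t.Supplier), t.Supplier
           ) AS seqnum
    FROM t
) AS ranked
WHERE seqnum = 1;
```

With the sample data, product 222 would come back with the supplier Apple rather than Apple Mac, since the shorter name sorts first.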

To delete all rows but one, you can use the same idea:

with todelete as (
      select t.*,
             row_number() over (partition by ProductID order by (select NULL)) as seqnum
      from t
     )
delete from todelete where seqnum > 1
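
Before running a delete like this, it may be worth checking which products are actually affected. A small sketch, again assuming the table is called t with a ProductID column as in the question:

```sql
-- List every ProductID that appears more than once, with its row count.
-- Assumes a table t(ProductID, Supplier) as in the question.
SELECT ProductID, COUNT(*) AS row_count
FROM t
GROUP BY ProductID
HAVING COUNT(*) > 1;
```

Each ProductID listed here keeps exactly one row after the delete, so for the sample data only product 222 would be touched.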
Gordon Linoff answered Jan 17 '23 02:01


If you want to keep only the row with the lowest ProductID for each Supplier:

DELETE  a
FROM    tableName a
        LEFT JOIN
        (
            SELECT  Supplier, MIN(ProductID) min_ID
            FROM    tableName
            GROUP   BY Supplier
        ) b ON  a.Supplier = b.Supplier AND
                a.ProductID = b.min_ID
WHERE   b.Supplier IS NULL

Or, if you want to delete the extra rows for any ProductID that appears more than once:

WITH cte 
AS
(
    SELECT  ProductID, Supplier,
            ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Supplier) rn
    FROM    tableName
)
DELETE FROM cte WHERE rn > 1
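
Since a delete like this is hard to undo, one option is to try it inside a transaction first. This is a sketch for SQL Server, reusing the placeholder name tableName from above; nothing is kept unless you change the final ROLLBACK to COMMIT:

```sql
-- Dry run: perform the delete inside a transaction, inspect, then undo.
-- tableName is the placeholder table name used in this answer.
BEGIN TRANSACTION;

WITH cte AS
(
    SELECT  ProductID, Supplier,
            ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY Supplier) rn
    FROM    tableName
)
DELETE FROM cte WHERE rn > 1;

-- Inspect what is left while the transaction is still open.
SELECT ProductID, Supplier
FROM   tableName
ORDER  BY ProductID;

-- Swap ROLLBACK for COMMIT once the remaining rows look right.
ROLLBACK TRANSACTION;
```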
John Woo answered Jan 17 '23 03:01