How to delete duplicate rows from a MySQL table

I have a MySQL table like:

ID, Col1, Col2, Col3, Col4, etc...

ID is the primary key and has worked correctly since the table was created.

What I want to do is delete all but one record from each group of rows where all the other columns are identical.

giggsey asked Apr 24 '11 11:04


3 Answers

DELETE DupRows.*
FROM MyTable AS DupRows
   INNER JOIN (
      SELECT MIN(ID) AS minId, col1, col2
      FROM MyTable
      GROUP BY col1, col2
      HAVING COUNT(*) > 1
   ) AS SaveRows ON SaveRows.col1 = DupRows.col1 AND SaveRows.col2 = DupRows.col2
      AND SaveRows.minId <> DupRows.ID;

Of course, you have to extend col1, col2 to cover all of the table's columns in all three places: the SELECT list, the GROUP BY clause, and the join condition.

Edit: I just pulled this out of a script I keep and re-tested it; it executes in MySQL.
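
Before running a DELETE like this, it can help to preview which rows it would remove. The following is a minimal sketch, assuming the same placeholder names (MyTable, col1, col2) as the statement above. It reuses the same join and only swaps DELETE for SELECT; the ORDER BY is added purely for readability.

```sql
-- Preview only: lists every duplicate row except the one with the lowest ID
-- in each (col1, col2) group. Nothing is deleted.
SELECT DupRows.*
FROM MyTable AS DupRows
   INNER JOIN (
      SELECT MIN(ID) AS minId, col1, col2
      FROM MyTable
      GROUP BY col1, col2
      HAVING COUNT(*) > 1
   ) AS SaveRows ON SaveRows.col1 = DupRows.col1 AND SaveRows.col2 = DupRows.col2
      AND SaveRows.minId <> DupRows.ID
ORDER BY DupRows.col1, DupRows.col2, DupRows.ID;
```

If the rows listed are the ones you expect to lose, the DELETE with the same join should remove exactly those rows.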

Christo answered Oct 04 '22 18:10


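  1. RENAME TABLE [table w/ duplicates] TO [temporary table name]

  2. Create an empty table with the original table name and the same structure as the table that contained the duplicates.

  3. INSERT INTO [new table] SELECT DISTINCT * FROM [old table with duplicates]. Note that DISTINCT * compares every column, so if the table has a unique ID column, as in the question, no two rows count as duplicates. In that case, list only the non-ID columns in both the INSERT and the SELECT (this needs ID to be generated automatically, e.g. AUTO_INCREMENT).

  4. Drop the temporary table.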
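
The steps above could be written out roughly as follows. This is a sketch under assumptions, not a tested migration: MyTable_old is a made-up temporary name, MyTable and Col1 to Col4 are borrowed from the other posts on this page, and ID is assumed to be AUTO_INCREMENT so the new table can assign fresh values.

```sql
-- Step 1: move the table with duplicates out of the way.
-- MyTable_old is a hypothetical name chosen for this example.
RENAME TABLE MyTable TO MyTable_old;

-- Step 2: CREATE TABLE ... LIKE copies column definitions and indexes,
-- but no rows (and no foreign key definitions).
CREATE TABLE MyTable LIKE MyTable_old;

-- Step 3: ID is left out on purpose. SELECT DISTINCT * would include the
-- unique ID, so every row would look distinct and nothing would be removed.
-- Assumes ID is AUTO_INCREMENT; the surviving rows get new ID values.
INSERT INTO MyTable (Col1, Col2, Col3, Col4)
SELECT DISTINCT Col1, Col2, Col3, Col4
FROM MyTable_old;

-- Step 4: only after checking the new table's contents.
DROP TABLE MyTable_old;
```

Because the rows are re-inserted, their ID values change. If other tables reference the old IDs, this approach is probably not suitable as written.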

user3768826 answered Oct 04 '22 18:10


Without nested selects or temporary tables.

DELETE  t1
FROM    table_name t1, table_name t2
WHERE   
            (t1.Col1 = t2.Col1 OR t1.Col1 IS NULL AND t2.Col1 IS NULL)
        AND (t1.Col2 = t2.Col2 OR t1.Col2 IS NULL AND t2.Col2 IS NULL)
        AND (t1.Col3 = t2.Col3 OR t1.Col3 IS NULL AND t2.Col3 IS NULL)
        AND (t1.Col4 = t2.Col4 OR t1.Col4 IS NULL AND t2.Col4 IS NULL)
        ...
        AND t1.ID < t2.ID;
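
MySQL also has a NULL-safe equality operator, <=>, which returns 1 when both operands are NULL and 0 when only one is. Each (a = b OR a IS NULL AND b IS NULL) pair above could therefore likely be shortened. A sketch of that variant, showing only the four columns named in the question (repeat the pattern for any remaining columns):

```sql
-- Same idea as the statement above, written with an explicit join and the
-- NULL-safe operator <=>. A row is deleted when an otherwise identical row
-- with a higher ID exists, so the highest ID in each group is the one kept.
DELETE t1
FROM   table_name t1
       INNER JOIN table_name t2
          ON  t1.Col1 <=> t2.Col1
          AND t1.Col2 <=> t2.Col2
          AND t1.Col3 <=> t2.Col3
          AND t1.Col4 <=> t2.Col4
          AND t1.ID < t2.ID;
```

To keep the lowest ID in each group instead, the last condition would become t1.ID > t2.ID.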

Basilevs answered Oct 04 '22 18:10