I have a MySQL table like:
ID, Col1, Col2, Col3, Col4, etc...
ID is a primary key and has been working since the table's creation.
What I want to do is delete all but one record from each set of rows where all the other columns are identical.
The DISTINCT keyword eliminates duplicate rows from a result.
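As a rough sketch of what DISTINCT does in this situation, the example below uses a hypothetical table MyTable with made-up sample rows (none of this comes from the original question). Note that SELECT DISTINCT * would not collapse anything here, because ID differs on every row, so the non-key columns are listed explicitly.

```sql
-- Hypothetical table and sample data, for illustration only.
CREATE TABLE MyTable (
    ID   INT AUTO_INCREMENT PRIMARY KEY,
    Col1 VARCHAR(20),
    Col2 VARCHAR(20)
);

INSERT INTO MyTable (Col1, Col2) VALUES
    ('Carol', 'x'), ('Carol', 'x'), ('Carol', 'x'), ('Dave', 'y');

-- One row per distinct (Col1, Col2) combination.
-- ID is left out on purpose, since it is unique per row.
SELECT DISTINCT Col1, Col2
FROM MyTable;

-- Combinations that occur more than once, with the number of copies.
SELECT Col1, Col2, COUNT(*) AS copies
FROM MyTable
GROUP BY Col1, Col2
HAVING COUNT(*) > 1;
```

DISTINCT only affects the result set of the SELECT; the duplicate rows stay in the table until they are deleted.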
You can use a DELETE statement with a condition for this, since we need to keep one record and delete the rest of the duplicates. For example, if “Carol” appears in three identical rows, such a query deletes two of them and leaves one “Carol” record.
DELETE DupRows.*
FROM MyTable AS DupRows
INNER JOIN (
    SELECT MIN(ID) AS minId, col1, col2
    FROM MyTable
    GROUP BY col1, col2
    HAVING COUNT(*) > 1
) AS SaveRows
    ON  SaveRows.col1 = DupRows.col1
    AND SaveRows.col2 = DupRows.col2
    AND SaveRows.minId <> DupRows.ID;
Of course you have to extend col1, col2 to all columns in all three places: the SELECT list, the GROUP BY, and the join condition.
Edit: I just pulled this out of a script I keep and re-tested it; it executes in MySQL.
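Before running a DELETE like the one above, it can help to preview the rows it would remove. A minimal sketch, assuming the same MyTable with columns ID, col1 and col2 as in that statement: the join is unchanged and only the DELETE is swapped for a SELECT.

```sql
-- Assumes MyTable(ID, col1, col2) already exists, as in the DELETE above.
-- Lists every duplicate row except the one with the lowest ID in its group,
-- i.e. the rows the DELETE would remove.
SELECT DupRows.*
FROM MyTable AS DupRows
INNER JOIN (
    SELECT MIN(ID) AS minId, col1, col2
    FROM MyTable
    GROUP BY col1, col2
    HAVING COUNT(*) > 1
) AS SaveRows
    ON  SaveRows.col1 = DupRows.col1
    AND SaveRows.col2 = DupRows.col2
    AND SaveRows.minId <> DupRows.ID
ORDER BY DupRows.col1, DupRows.col2, DupRows.ID;
```

If the result looks right, the same join can be used for the DELETE.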
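1. RENAME TABLE [table w/ duplicates] TO [temporary table name]
2. Create an identical, empty table with the original table name (the one that contained the duplicates).
3. INSERT INTO [new table] SELECT MIN(ID), Col1, Col2, Col3, Col4 FROM [old table with duplicates] GROUP BY Col1, Col2, Col3, Col4
   (SELECT DISTINCT * only works when the table has no unique column; here ID differs on every row, so DISTINCT * would copy every duplicate.)
4. Drop the temporary table.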
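The steps above could be spelled out roughly as follows. The names MyTable and MyTable_old are hypothetical, and the column list assumes the four data columns from the question. Because ID is unique per row, this sketch keeps the lowest ID of each group via GROUP BY rather than relying on DISTINCT *.

```sql
-- Hypothetical names: MyTable holds the duplicates, MyTable_old is the temporary name.
RENAME TABLE MyTable TO MyTable_old;

-- Creates an empty table with the same column definitions and indexes.
CREATE TABLE MyTable LIKE MyTable_old;

-- Copy one row per group of identical data columns, keeping the lowest ID.
-- GROUP BY puts NULLs in the same group, so rows with NULLs are deduplicated too.
INSERT INTO MyTable (ID, Col1, Col2, Col3, Col4)
SELECT MIN(ID), Col1, Col2, Col3, Col4
FROM MyTable_old
GROUP BY Col1, Col2, Col3, Col4;

-- Sanity check before dropping anything: compare the row counts.
SELECT
    (SELECT COUNT(*) FROM MyTable_old) AS rows_before,
    (SELECT COUNT(*) FROM MyTable)     AS rows_after;

DROP TABLE MyTable_old;
```

Note that CREATE TABLE ... LIKE does not copy foreign key definitions, so those would need to be recreated separately if the original table had any.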
This version works without nested selects or temporary tables. It keeps the row with the highest ID in each group of duplicates.
DELETE t1
FROM table_name t1, table_name t2
WHERE
(t1.Col1 = t2.Col1 OR t1.Col1 IS NULL AND t2.Col1 IS NULL)
AND (t1.Col2 = t2.Col2 OR t1.Col2 IS NULL AND t2.Col2 IS NULL)
AND (t1.Col3 = t2.Col3 OR t1.Col3 IS NULL AND t2.Col3 IS NULL)
AND (t1.Col4 = t2.Col4 OR t1.Col4 IS NULL AND t2.Col4 IS NULL)
...
AND t1.ID < t2.ID;
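The repeated "equal or both NULL" conditions can be shortened with MySQL's NULL-safe equality operator <=>, which returns 1 when both sides are NULL. A sketch of the same self-join written that way, assuming table_name has exactly the columns ID and Col1 through Col4 (extend the list for more columns):

```sql
-- Assumes table_name(ID, Col1, Col2, Col3, Col4), as in the statement above.
-- A row is deleted when another row with identical data columns and a
-- higher ID exists, so the highest ID of each group survives.
DELETE t1
FROM table_name AS t1
INNER JOIN table_name AS t2
    ON  t1.Col1 <=> t2.Col1   -- <=> treats NULL <=> NULL as a match
    AND t1.Col2 <=> t2.Col2
    AND t1.Col3 <=> t2.Col3
    AND t1.Col4 <=> t2.Col4
    AND t1.ID < t2.ID;
```

Changing the last condition to t1.ID > t2.ID would keep the lowest ID instead.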