I have the table below and need to delete the rows that have a duplicate refID, while keeping at least one row for each refID; i.e., I need to remove rows 4 and 5. Please help me with this.
+----+-------+--------+
| ID | refID | data   |
+----+-------+--------+
|  1 |  1023 | aaaaaa |
|  2 |  1024 | bbbbbb |
|  3 |  1025 | cccccc |
|  4 |  1023 | ffffff |
|  5 |  1023 | gggggg |
|  6 |  1022 | rrrrrr |
+----+-------+--------+
Note that DISTINCT only removes duplicate rows from the result set; it doesn't delete duplicate rows in the table. If you want to select two columns and remove duplicates in one column, you should use the GROUP BY clause instead.
This is similar to Gordon Linoff's query, but without the subquery:
DELETE t1 FROM table t1
JOIN table t2
ON t2.refID = t1.refID
AND t2.ID < t1.ID
This uses an inner join to only delete rows where there is another row with the same refID but lower ID.
The benefit of avoiding a subquery is being able to utilize an index for the search. This query should perform well with a multi-column index on refID + ID.
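As a rough sketch of the index mentioned above, plus a way to preview which rows the join would match before deleting anything. The names your_table and idx_refid_id are placeholders for illustration and do not come from the original post.

```sql
-- Hypothetical names: your_table stands in for the real table,
-- idx_refid_id is an arbitrary index name.
CREATE INDEX idx_refid_id ON your_table (refID, ID);

-- Preview only: rows that have another row with the same refID and a lower ID.
-- DISTINCT is needed because a row can match several lower-ID rows.
SELECT DISTINCT t1.*
FROM your_table t1
JOIN your_table t2
  ON t2.refID = t1.refID
 AND t2.ID < t1.ID;
```

On the sample data from the question, this preview should list the rows with ID 4 and 5, which are the ones the DELETE would remove.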
I would do:
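-- MySQL does not allow a DELETE to select directly from its own target table
-- in a subquery, so each inner select is wrapped in a derived table
delete from t where
ID not in (select id from (select min(ID) as id from t group by refID having count(*) > 1) keep_ids)
and refID in (select refID from (select refID from t group by refID having count(*) > 1) dup_refs)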
The criteria are that refID is among the duplicated values and ID differs from the minimum ID within each group of duplicates. It will work better if refID is indexed.
Otherwise, provided you can run it multiple times, issue the following query repeatedly until it no longer deletes anything:
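delete from t
where
-- derived table wrapper: MySQL cannot select directly from the table being deleted from
ID in (select id from (select max(ID) as id from t group by refID having count(*) > 1) last_dups)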
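To know when to stop repeating the query above, one option is to check whether any refID still occurs more than once. This is only a sketch, assuming the table is named t as in this answer:

```sql
-- Lists every refID that still has duplicates, with how many rows it has.
-- When this returns no rows, there is nothing left to delete.
SELECT refID, COUNT(*) AS copies
FROM t
GROUP BY refID
HAVING COUNT(*) > 1;
```

On the sample data from the question, this should return refID 1023 with 3 copies before any delete, so the max(ID) delete would need two passes (removing ID 5, then ID 4).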
Another variant, in some cases a bit faster than Marcus's and NJ73's answers:
DELETE ourTable
FROM ourTable JOIN
(SELECT MIN(ID) AS ID, targetField
FROM ourTable
GROUP BY targetField HAVING COUNT(*) > 1) t2
ON ourTable.targetField = t2.targetField AND ourTable.ID != t2.ID;
Hope that helps someone. On big tables, Marcus's answer stalls.
In MySQL, you can do this with a join in delete:
delete t
from table t left join
(select min(id) as id
from table t
group by refId
) tokeep
on t.id = tokeep.id
where tokeep.id is null;
For each refId, the subquery calculates the minimum of the id column (presumed to be unique over the whole table). It uses a left join for the match, so anything that doesn't match has a NULL value for tokeep.id. These are the ones that are deleted.
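For anyone who wants to try this on the sample data from the question, here is a minimal sketch. The table name demo and the column types are assumptions made for illustration; only the column names and values come from the question.

```sql
-- Hypothetical table: name and column types are assumed.
CREATE TABLE demo (
  ID    INT PRIMARY KEY,
  refID INT NOT NULL,
  data  VARCHAR(20)
);

INSERT INTO demo (ID, refID, data) VALUES
  (1, 1023, 'aaaaaa'),
  (2, 1024, 'bbbbbb'),
  (3, 1025, 'cccccc'),
  (4, 1023, 'ffffff'),
  (5, 1023, 'gggggg'),
  (6, 1022, 'rrrrrr');

-- The IDs to keep: the lowest ID for each refID.
SELECT MIN(ID) AS id FROM demo GROUP BY refID;

-- Delete every row whose ID is not one of the kept IDs.
DELETE t
FROM demo t
LEFT JOIN (SELECT MIN(ID) AS id FROM demo GROUP BY refID) tokeep
  ON t.ID = tokeep.id
WHERE tokeep.id IS NULL;

SELECT * FROM demo ORDER BY ID;
```

After the delete, the final select should show the rows with ID 1, 2, 3 and 6, since 4 and 5 share refID 1023 with row 1.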