I have duplicate rows in my table, how can I delete them based on a single column's value?
Eg
uniqueid, col2, col3 ...
1, john, simpson
2, sally, roberts
1, johnny, simpson
delete any duplicate uniqueIds
to get
1, John, Simpson
2, Sally, Roberts
You probably have a row id that is assigned by the DB upon insertion and is actually unique. I'll call this rowId in my example.
rowId |uniqueid |col2 |col3
----- |-------- |---- |----
1 10 john simpson
2 20 sally roberts
3 10 johnny simpson
You can remove duplicates by grouping on the thing that is supposed to be unique (whether it be one column or many), then you grab a rowId from each group, and delete everything else besides those rowIds. In the inner query, everything in the table will have a rowId except for the duplicate rows.
select *
--DELETE
FROM MyTable
WHERE rowId NOT IN
(SELECT MIN(rowId)
FROM MyTable
GROUP BY uniqueid);
You could also use MAX instead of MIN with similar results.
DECLARE @du TABLE (
id INT,
Name VARCHAR(4)
)
INSERT INTO @du VALUES(1,'john')
INSERT INTO @du VALUES(2,'jane')
INSERT INTO @du VALUES(1,'john')
;WITH dup (id,dp)
AS
(SELECT id
, ROW_NUMBER() OVER(PARTITION BY id ORDER BY Name) AS dp
FROM @du)
DELETE FROM dup
WHERE dp > 1
SELECT *
FROM @du
You can DELETE
from a cte:
WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY uniqueid ORDER BY col2)'RowRank'
FROM Table)
DELETE FROM cte
WHERE RowRank > 1
The ROW_NUMBER()
function assigns a number to each row. PARTITION BY
is used to start the numbering over for each item in that group, in this case each value of uniqueid
will start numbering at 1 and go up from there. ORDER BY
determines which order the numbers go in. Since each uniqueid
gets numbered starting at 1, any record with a ROW_NUMBER()
greater than 1 has a duplicate uniqueid
To get an understanding of how the ROW_NUMBER()
function works, just try it out:
SELECT *,ROW_NUMBER() OVER(PARTITION BY uniqueid ORDER BY col2)'RowRank'
FROM Table
ORDER BY uniqueid
You can adjust the logic of the ROW_NUMBER()
function to adjust which record you'll keep or remove.
For instance, perhaps you'd like to do this in multiple steps, first deleting records with the same last name but different first names, you could add last name to the PARTITION BY
:
WITH cte AS (SELECT *,ROW_NUMBER() OVER(PARTITION BY uniqueid, col3 ORDER BY col2)'RowRank'
FROM Table)
DELETE FROM cte
WHERE RowRank > 1
Here is simple magic to remove duplicates
select * into NewTable from ExistingTable
union
select * from ExistingTable;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With