Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Delete duplicate records in SQL Server?

Consider a column named EmployeeName table Employee. The goal is to delete repeated records, based on the EmployeeName field.

EmployeeName
------------
Anand
Anand
Anil
Dipak
Anil
Dipak
Dipak
Anil

Using one query, I want to delete the records which are repeated.

How can this be done with TSQL in SQL Server?

like image 366
usr021986 Avatar asked Jul 23 '10 10:07

usr021986


People also ask

How do you delete duplicate records in SQL Server?

To delete the duplicate rows from the table in SQL Server, you follow these steps: Find duplicate rows using GROUP BY clause or ROW_NUMBER() function. Use DELETE statement to remove the duplicate rows.


3 Answers

You can do this with window functions. It will order the dupes by empId, and delete all but the first one.

delete x from (   select *, rn=row_number() over (partition by EmployeeName order by empId)   from Employee  ) x where rn > 1; 

Run it as a select to see what would be deleted:

select * from (   select *, rn=row_number() over (partition by EmployeeName order by empId)   from Employee  ) x where rn > 1; 
like image 56
John Gibb Avatar answered Sep 23 '22 08:09

John Gibb


Assuming that your Employee table also has a unique column (ID in the example below), the following will work:

delete from Employee  where ID not in (     select min(ID)     from Employee      group by EmployeeName  ); 

This will leave the version with the lowest ID in the table.

Edit
Re McGyver's comment - as of SQL 2012

MIN can be used with numeric, char, varchar, uniqueidentifier, or datetime columns, but not with bit columns

For 2008 R2 and earlier,

MIN can be used with numeric, char, varchar, or datetime columns, but not with bit columns (and it also doesn't work with GUID's)

For 2008R2 you'll need to cast the GUID to a type supported by MIN, e.g.

delete from GuidEmployees where CAST(ID AS binary(16)) not in (     select min(CAST(ID AS binary(16)))     from GuidEmployees     group by EmployeeName  ); 

SqlFiddle for various types in Sql 2008

SqlFiddle for various types in Sql 2012

like image 41
StuartLC Avatar answered Sep 23 '22 08:09

StuartLC


You could try something like the following:

delete T1
from MyTable T1, MyTable T2
where T1.dupField = T2.dupField
and T1.uniqueField > T2.uniqueField  

(this assumes that you have an integer based unique field)

Personally though I'd say you were better off trying to correct the fact that duplicate entries are being added to the database before it occurs rather than as a post fix-it operation.

like image 44
Ben Cawley Avatar answered Sep 25 '22 08:09

Ben Cawley