Delete duplicate records in SQL Server?

Tags: sql, tsql, duplicates, delete-row

Consider a column named EmployeeName in a table named Employee. The goal is to delete repeated records, based on the EmployeeName field.

EmployeeName
------------
Anand
Anand
Anil
Dipak
Anil
Dipak
Dipak
Anil

Using one query, I want to delete the records which are repeated.

How can this be done with TSQL in SQL Server?


asked Jul 23 '10 10:07

usr021986

3 Answers

You can do this with window functions. The query below numbers the duplicates of each EmployeeName in empId order and deletes all but the first one.

delete x
from (
    select *, rn = row_number() over (partition by EmployeeName order by empId)
    from Employee
) x
where rn > 1;

Run it as a select to see what would be deleted:

select *
from (
    select *, rn = row_number() over (partition by EmployeeName order by empId)
    from Employee
) x
where rn > 1;
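To try this end to end, here is a minimal sketch that loads the sample names from the question into a scratch table and runs a CTE form of the same row_number() delete. The temp table #Employee and its identity column empId are assumptions for illustration only; the question does not show the real schema.

```sql
-- Hypothetical scratch table; empId is an assumed identity column.
create table #Employee (
    empId        int identity(1,1) primary key,
    EmployeeName varchar(50) not null
);

insert into #Employee (EmployeeName)
values ('Anand'), ('Anand'), ('Anil'), ('Dipak'),
       ('Anil'), ('Dipak'), ('Dipak'), ('Anil');

-- Number the rows within each name; rn = 1 is the row with the lowest empId.
with numbered as (
    select rn = row_number() over (partition by EmployeeName order by empId)
    from #Employee
)
delete from numbered   -- deleting through the CTE removes rows from #Employee
where rn > 1;

-- One row per EmployeeName should remain.
select empId, EmployeeName
from #Employee
order by empId;

drop table #Employee;
```

The CTE form behaves the same as the derived-table form above; some people find it easier to read because the numbering step and the delete step are separated.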

answered Sep 23 '22 08:09

John Gibb

Assuming that your Employee table also has a unique column (ID in the example below), the following will work:

delete from Employee
where ID not in (
    select min(ID)
    from Employee
    group by EmployeeName
);

This will leave the version with the lowest ID in the table.
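Because the delete is driven by a plain predicate, the same predicate can be used in a select first to preview which rows would go. The following is a sketch under assumptions: the table variable @Employee and its identity column ID are made up for the demonstration and stand in for the real table.

```sql
-- Hypothetical stand-in for the Employee table; ID is an assumed unique column.
declare @Employee table (
    ID           int identity(1,1) primary key,
    EmployeeName varchar(50) not null
);

insert into @Employee (EmployeeName)
values ('Anand'), ('Anand'), ('Anil'), ('Dipak'),
       ('Anil'), ('Dipak'), ('Dipak'), ('Anil');

-- Preview: every row whose ID is not the lowest ID for its name.
select ID, EmployeeName
from @Employee
where ID not in (
    select min(ID)
    from @Employee
    group by EmployeeName
)
order by ID;

-- The delete uses exactly the same predicate as the preview.
delete from @Employee
where ID not in (
    select min(ID)
    from @Employee
    group by EmployeeName
);

-- Remaining rows: one per EmployeeName.
select ID, EmployeeName
from @Employee
order by ID;
```

Note that the table variable only lives for the duration of the batch, so the whole script has to be run in one go.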

Edit: Re McGyver's comment, as of SQL Server 2012:

MIN can be used with numeric, char, varchar, uniqueidentifier, or datetime columns, but not with bit columns

For 2008 R2 and earlier,

MIN can be used with numeric, char, varchar, or datetime columns, but not with bit columns (and it also doesn't work with GUIDs)

For 2008 R2 you'll need to cast the GUID to a type supported by MIN, e.g.

delete from GuidEmployees
where CAST(ID AS binary(16)) not in (
    select min(CAST(ID AS binary(16)))
    from GuidEmployees
    group by EmployeeName
);


answered Sep 23 '22 08:09

StuartLC

You could try something like the following:

delete T1
from MyTable T1
join MyTable T2
  on T1.dupField = T2.dupField
 and T1.uniqueField > T2.uniqueField;

(This assumes that you have an integer-based unique field, so that the rows within each set of duplicates can be ordered.)
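The self-join can also be written with EXISTS, which states the rule directly: delete a row when another row with the same dupField and a smaller uniqueField exists. This is a swapped-in variant rather than the query above, and the temp table #MyTable with its sample data is an assumption made purely for illustration.

```sql
-- Hypothetical scratch table using the column names from the query above.
create table #MyTable (
    uniqueField int identity(1,1) primary key,
    dupField    varchar(50) not null
);

insert into #MyTable (dupField)
values ('Anand'), ('Anand'), ('Anil'), ('Dipak'),
       ('Anil'), ('Dipak'), ('Dipak'), ('Anil');

-- Delete every row that has an "earlier" row with the same dupField.
delete T1
from #MyTable T1
where exists (
    select 1
    from #MyTable T2
    where T2.dupField = T1.dupField
      and T2.uniqueField < T1.uniqueField
);

-- The row with the lowest uniqueField per dupField should remain.
select uniqueField, dupField
from #MyTable
order by uniqueField;

drop table #MyTable;
```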

Personally, though, I'd say you are better off preventing duplicate entries from being added to the database in the first place, rather than cleaning them up afterwards.
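One way to prevent duplicates at the source is a unique constraint on the column. The sketch below is an assumption about how that might look, using a hypothetical scratch table #Employee; on a real table the existing duplicates would have to be removed before such a constraint could be added.

```sql
-- Hypothetical scratch table; the unique constraint rejects repeated names.
create table #Employee (
    empId        int identity(1,1) primary key,
    EmployeeName varchar(50) not null unique
);

insert into #Employee (EmployeeName) values ('Anand');

begin try
    -- Second insert of the same name violates the unique constraint.
    insert into #Employee (EmployeeName) values ('Anand');
end try
begin catch
    -- Report the violation instead of aborting the batch.
    select error_number() as ErrorNumber, error_message() as ErrorMessage;
end catch;

-- Only the first 'Anand' row is present.
select empId, EmployeeName
from #Employee;

drop table #Employee;
```

Whether a unique constraint is appropriate depends on the data: two different employees can legitimately share a name, so in practice the constraint may belong on some other combination of columns.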

answered Sep 25 '22 08:09

Ben Cawley
