Delete multiple duplicate rows in table

Tags: sql, sql-server, tsql

I have multiple groups of duplicates in one table (3 records for one value, 2 for another, etc.), that is, several values for which more than one row exists.

Below is what I came up with to delete them, but I have to run the script for however many duplicates there are:

set rowcount 1
delete from Table
where code in (
  select code from Table 
  group by code
  having (count(code) > 1)
)
set rowcount 0

This works well to a degree. I need to run this for every group of duplicates, and then it only deletes 1 (which is all I need right now).


asked Oct 12 '10 17:10

Dan

3 Answers

If you have a key column on the table, you can use it to uniquely identify the "distinct" rows in your table.

Use a subquery to identify the IDs of the rows to keep (one per group of duplicates), then delete everything outside that set. Something along these lines:

create table #TempTable
(
    ID int identity(1,1) not null primary key,
    SomeData varchar(100) not null
)

insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData3')
insert into #TempTable(SomeData) values('someData4')

select * from #TempTable

--Records to be deleted
SELECT ID
FROM #TempTable
WHERE ID NOT IN
(
    select MAX(ID)
    from #TempTable
    group by SomeData
)

--Delete them
DELETE
FROM #TempTable
WHERE ID NOT IN
(
    select MAX(ID)
    from #TempTable
    group by SomeData
)

--Final Result Set
select * from #TempTable

drop table #TempTable;

Alternatively, you could use a CTE (run it in place of the DELETE above, while #TempTable still exists), for example:

WITH UniqueRecords AS
(
    select MAX(ID) AS ID
    from #TempTable
    group by SomeData
)
DELETE A
FROM #TempTable A
    LEFT outer join UniqueRecords B on
        A.ID = B.ID
WHERE B.ID IS NULL


answered Sep 22 '22 15:09

John Sansom

It is frequently more efficient to copy the unique rows into a temporary table, drop the source table, and rename the temporary table back to the original name.

I reused the definition and data of #TempTable, called SrcTable here, since a temporary table cannot be renamed to a regular one.

create table SrcTable
(
    ID int identity(1,1) not null primary key,
    SomeData varchar(100) not null
)

insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData3')
insert into SrcTable(SomeData) values('someData4')

(The table definition and data above are by John Sansom, from the previous answer.)

-- cloning "unique" part
SELECT * INTO TempTable 
FROM SrcTable --original table
WHERE id IN  
(SELECT MAX(id) AS ID
FROM SrcTable
GROUP BY SomeData);
GO

DROP TABLE SrcTable
GO

EXEC sys.sp_rename 'TempTable', 'SrcTable'
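One thing to watch with this approach: SELECT ... INTO copies the columns and data but not constraints or indexes, so the renamed table no longer has the primary key from the original definition. A minimal sketch of restoring it, assuming the SrcTable definition above; the constraint name PK_SrcTable is hypothetical:

```sql
-- Run after the rename. SELECT ... INTO does not carry over constraints
-- or indexes, so the primary key has to be re-created explicitly.
-- PK_SrcTable is a hypothetical name chosen for illustration.
ALTER TABLE SrcTable
    ADD CONSTRAINT PK_SrcTable PRIMARY KEY (ID);
```

Any other indexes, triggers, or permissions on the original table would need the same treatment.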

answered Sep 25 '22 15:09

Gennady Vanin Геннадий Ванин

Alternatively, you can use the ROW_NUMBER() function to filter out duplicates:

;WITH [CTE_DUPLICATES] AS 
(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY SomeData)
FROM #TempTable
) 
DELETE FROM [CTE_DUPLICATES] WHERE RN > 1
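To see which rows this would remove before deleting anything, the same CTE can be previewed with a SELECT. This is only a sketch, assuming the #TempTable definition (ID, SomeData) from the first answer. Ordering by ID DESC is my assumption, made so that the surviving row is deterministic (the highest ID in each group, matching the MAX(ID) approach):

```sql
-- Assumes #TempTable (ID, SomeData) from the first answer already exists.
;WITH [CTE_DUPLICATES] AS
(
    SELECT ID,
           SomeData,
           -- RN = 1 marks the row kept in each group (highest ID, by assumption)
           RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY ID DESC)
    FROM #TempTable
)
SELECT ID, SomeData, RN
FROM [CTE_DUPLICATES]
WHERE RN > 1;  -- rows that the DELETE would remove
```

Once the preview looks right, replacing the final SELECT with DELETE FROM [CTE_DUPLICATES] WHERE RN > 1 removes those rows in a single pass, however many duplicates each group has.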

answered Sep 22 '22 15:09

anivas
