How can I remove duplicate rows?

What is the best way to remove duplicate rows from a fairly large SQL Server table (i.e. 300,000+ rows)?

The rows, of course, will not be perfect duplicates because of the existence of the RowID identity field.

MyTable

    RowID int not null identity(1,1) primary key,
    Col1 varchar(20) not null,
    Col2 varchar(2048) not null,
    Col3 tinyint not null

268

asked Aug 20 '08 21:08

Seibar

1 Answer

Assuming no nulls, you GROUP BY the columns that define a duplicate, and SELECT the MIN (or MAX) RowId of each group as the row to keep. Then delete every row whose RowId is not among the kept ones:

    -- Name the target table first, then join it to the set of rows to keep.
    DELETE MyTable
    FROM MyTable
    LEFT OUTER JOIN (
        SELECT MIN(RowId) as RowId, Col1, Col2, Col3
        FROM MyTable
        GROUP BY Col1, Col2, Col3
    ) as KeepRows ON
        MyTable.RowId = KeepRows.RowId
    WHERE
        KeepRows.RowId IS NULL  -- no match in KeepRows, so the row is a duplicate

In case you have a GUID instead of an integer, you can replace

MIN(RowId)

with

CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn)))
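To see the keep-the-MIN-per-group idea on a handful of rows, here is a minimal sketch that runs against an in-memory SQLite database from Python. It is not SQL Server: SQLite does not accept a joined DELETE, so the sketch swaps in an equivalent NOT IN subquery, and the helper names (build_demo, remove_duplicates) and the sample rows are made up for illustration.

```python
import sqlite3


def build_demo():
    """Create an in-memory table shaped like MyTable, with some duplicate rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE MyTable (
            RowID INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for identity(1,1)
            Col1 TEXT NOT NULL,
            Col2 TEXT NOT NULL,
            Col3 INTEGER NOT NULL
        )
        """
    )
    # Hypothetical sample data: rows 1, 2 and 4 are duplicates of each other.
    conn.executemany(
        "INSERT INTO MyTable (Col1, Col2, Col3) VALUES (?, ?, ?)",
        [("a", "x", 1), ("a", "x", 1), ("b", "y", 2), ("a", "x", 1), ("b", "y", 3)],
    )
    conn.commit()
    return conn


def remove_duplicates(conn):
    """Delete every row except the lowest RowID in each (Col1, Col2, Col3) group."""
    cur = conn.execute(
        """
        DELETE FROM MyTable
        WHERE RowID NOT IN (
            SELECT MIN(RowID)
            FROM MyTable
            GROUP BY Col1, Col2, Col3
        )
        """
    )
    conn.commit()
    return cur.rowcount  # number of rows deleted


if __name__ == "__main__":
    conn = build_demo()
    print("deleted:", remove_duplicates(conn))
    for row in conn.execute("SELECT RowID, Col1, Col2, Col3 FROM MyTable ORDER BY RowID"):
        print(row)
```

Running the SELECT MIN(RowId) ... GROUP BY subquery on its own before deleting is a cheap way to check which rows would survive.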

answered Oct 06 '22 09:10

Mark Brackett

Tags: sql-server, tsql, duplicates
