Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking for duplicate data in SQL Server

Please don't ask me why but there is a lot of duplicate data where every field is duplicated.

For example

alex, 1
alex, 1
liza, 32
hary, 34

I will need to eliminate from this table one of the alex, 1 rows

I know this algorithm will be very ineffecient, but it does not matter. I will need to remove duplicate data.

What is the best way to do this? Please keep in mind I do not have 2 fields, I actually have about 10 fields to check on.

like image 252
Alex Gordon Avatar asked Dec 07 '22 21:12

Alex Gordon


2 Answers

As you said, yes this will be very inefficient, but you can try something like

DECLARE @TestTable TABLE(
        Name VARCHAR(20),
        SomeVal INT
)
INSERT INTO @TestTable SELECT 'alex', 1
INSERT INTO @TestTable SELECT 'alex', 1
INSERT INTO @TestTable SELECT 'liza', 32
INSERT INTO @TestTable SELECT 'hary', 34

SELECT  *
FROM    @TestTable

;WITH DuplicateVals AS (
    SELECT  *,
            ROW_NUMBER() OVER (PARTITION BY Name, SomeVal ORDER BY (SELECT NULL)) RowID
    FROM    @TestTable
)
DELETE FROM DuplicateVals WHERE RowID > 1

SELECT *
FROM    @TestTable
like image 105
Adriaan Stander Avatar answered Feb 25 '23 13:02

Adriaan Stander


I understand this does not answer the specific question (eliminating dupes in SAME table), but I'm offering the solution because it is very fast and might work best for the author.

Speedy solution, if you don't mind creating a new table, create a new table with the same schema named NewTable.

Execute this SQL

 Insert into NewTable
 Select 
   name, 
   num 
 from
   OldTable
 group by
   name,
   num

Just include every field name in both the select and group by clauses.

like image 44
Brian Webster Avatar answered Feb 25 '23 14:02

Brian Webster