
Delete duplicate records from a SQL table without a primary key

I have the table below, with the following records in it:

create table employee (
  EmpId number,
  EmpName varchar2(10),
  EmpSSN varchar2(11)
);

insert into employee values (1, 'Jack', '555-55-5555');
insert into employee values (2, 'Joe', '555-56-5555');
insert into employee values (3, 'Fred', '555-57-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
insert into employee values (1, 'Jack', '555-55-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');

I don't have a primary key on this table, and the records above are already in it. I want to remove the duplicate records, meaning rows that have the same values in the EmpId and EmpSSN fields.

Example: EmpId 5

Can anyone help me write a query to delete those duplicate records?

Thanks in advance
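To make the duplicates concrete before deleting anything, here is a minimal sketch that lists the duplicate groups. It runs the query through Python's built-in sqlite3 module purely as a stand-in for the real database, so the column types (INTEGER, TEXT) are assumptions; the GROUP BY / HAVING query itself is standard SQL.

```python
import sqlite3

# Sample rows copied from the question (EmpId, EmpName, EmpSSN).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]

# In-memory SQLite table standing in for the real employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)

# Group on the columns that define a duplicate; keep groups with more than one row.
duplicate_groups = conn.execute(
    """
    SELECT EmpId, EmpSSN, COUNT(*) AS copies
    FROM employee
    GROUP BY EmpId, EmpSSN
    HAVING COUNT(*) > 1
    ORDER BY EmpId
    """
).fetchall()

for emp_id, ssn, copies in duplicate_groups:
    print(emp_id, ssn, copies)
```

With the sample data, EmpId 5 shows up with three copies, which matches the example in the question.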

Shyju asked Jun 12 '09 07:06


People also ask

How do you remove duplicates without a primary key in SQL?

In SQL Server, the SET ROWCOUNT command limits the number of rows affected by a statement. Setting it to 1 before a DELETE removes just one of the duplicate rows from the table.

How do you delete multiple duplicate rows in SQL?

You can use the SQL RANK function with a PARTITION BY clause to remove duplicate rows. RANK numbers the rows within each partition, so if the ORDER BY column is unique within the partition, every duplicate gets a distinct rank and the rows ranked above 1 can be deleted. If the ORDER BY values tie, RANK gives the tied rows the same rank; ROW_NUMBER always numbers them uniquely.
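One caveat worth checking for yourself: when rows are exact duplicates there is nothing unique to order by, so RANK ties. The sketch below compares RANK and ROW_NUMBER side by side. It uses Python's sqlite3 module as a stand-in engine (an assumption; window functions need SQLite 3.25 or newer) and a cut-down two-column version of the employee table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpSSN TEXT)")
# Three identical copies of EmpId 5 plus one unique row.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(5, "555-59-5555")] * 3 + [(2, "555-56-5555")],
)

# Same window for both functions; only the numbering function differs.
rows = conn.execute(
    """
    SELECT EmpId,
           RANK()       OVER (PARTITION BY EmpId, EmpSSN ORDER BY EmpId) AS rnk,
           ROW_NUMBER() OVER (PARTITION BY EmpId, EmpSSN ORDER BY EmpId) AS rn
    FROM employee
    ORDER BY EmpId, rn
    """
).fetchall()

for emp_id, rnk, rn in rows:
    print(emp_id, rnk, rn)
```

All three copies of EmpId 5 get rank 1, so a filter like "rank > 1" would match nothing, while ROW_NUMBER gives them 1, 2 and 3.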

How do I remove duplicate records from a select query?

The go-to solution for removing duplicate rows from a result set is to include the DISTINCT keyword in your SELECT statement. It tells the query engine to remove duplicates so that every row in the result set is unique. This only affects the query output; the duplicate rows stay in the table.
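A small sketch of that distinction, again using Python's sqlite3 module as an assumed stand-in database: SELECT DISTINCT collapses the duplicates in the result, but the row count of the table is unchanged afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (5, "Cathy", "555-59-5555"),
        (5, "Cathy", "555-59-5555"),
        (5, "Cathy", "555-59-5555"),
        (6, "Lisa", "555-70-5555"),
        (6, "Lisa", "555-70-5555"),
    ],
)

# DISTINCT removes duplicates from the result set only.
distinct_rows = conn.execute(
    "SELECT DISTINCT EmpId, EmpName, EmpSSN FROM employee ORDER BY EmpId"
).fetchall()

# The table itself still holds every copy.
total_rows = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(distinct_rows)
print(total_rows)
```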


2 Answers

It is very simple. I tried this in SQL Server 2008:

DELETE SUB
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
    FROM Employee
) SUB
WHERE SUB.cnt > 1
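Deleting through a derived table like this is SQL Server syntax. As a rough, runnable illustration of the same ROW_NUMBER idea, here is a sketch against SQLite via Python's sqlite3 module. This is a swapped-in variant, not the statement above: it assumes SQLite 3.25 or newer and uses SQLite's implicit rowid to identify which physical rows to delete.

```python
import sqlite3

# Sample rows copied from the question (EmpId, EmpName, EmpSSN).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)

before = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# Number the rows inside each duplicate group, then delete every row
# numbered above 1. rowid (SQLite-specific) pins down the physical row.
conn.execute(
    """
    DELETE FROM employee
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY EmpId, EmpName, EmpSSN
                       ORDER BY rowid
                   ) AS cnt
            FROM employee
        )
        WHERE cnt > 1
    )
    """
)

remaining = conn.execute(
    "SELECT EmpId, EmpName, EmpSSN FROM employee ORDER BY EmpId"
).fetchall()

print(before, "rows before,", len(remaining), "rows after")
```

One copy of each employee is left, and no key column had to be added to the table.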
Anjib Rajkhowa answered Sep 28 '22 12:09


1. Add a primary key (code below).

2. Run the correct delete (code below).

3. Consider WHY you wouldn't want to keep that primary key.


Assuming MSSQL or compatible:

ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;

WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
BEGIN
    DELETE FROM Employee WHERE EmployeeID IN
    (
        SELECT MIN(EmployeeID) as [DeleteID]
        FROM Employee
        GROUP BY EmpID, EmpSSN
        HAVING COUNT(*) > 1
    )
END
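The loop removes one copy per duplicate group on each pass, so a group with three copies needs two passes. A set-based variant that keeps the lowest key per group should reach the same result in one statement. The sketch below compares the two using Python's sqlite3 module. It is an illustration under assumptions, not the T-SQL above: SQLite's implicit rowid stands in for the EmployeeID identity column, and the helper names (load, dedupe_loop, dedupe_single, contents) are hypothetical.

```python
import sqlite3

# Sample rows copied from the question (EmpId, EmpName, EmpSSN).
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]

# Lowest key in every group that still has duplicates.
# rowid plays the role of the EmployeeID identity column (assumption).
DUP_KEYS = """
    SELECT MIN(rowid) FROM employee
    GROUP BY EmpId, EmpSSN
    HAVING COUNT(*) > 1
"""


def load():
    """Build a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", ROWS)
    return conn


def dedupe_loop(conn):
    """Mirror the WHILE loop: delete one copy per group until none repeat."""
    passes = 0
    while conn.execute(DUP_KEYS).fetchone() is not None:
        conn.execute(f"DELETE FROM employee WHERE rowid IN ({DUP_KEYS})")
        passes += 1
    return passes


def dedupe_single(conn):
    """Set-based variant: keep the lowest key per group, delete the rest."""
    conn.execute(
        """
        DELETE FROM employee
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM employee GROUP BY EmpId, EmpSSN
        )
        """
    )


def contents(conn):
    return conn.execute(
        "SELECT EmpId, EmpName, EmpSSN FROM employee ORDER BY EmpId"
    ).fetchall()


loop_conn, single_conn = load(), load()
passes = dedupe_loop(loop_conn)
dedupe_single(single_conn)

print("loop passes:", passes)
print("same result:", contents(loop_conn) == contents(single_conn))
```

The loop ends up keeping the highest key in each group and the single statement keeps the lowest, but since the copies are identical the remaining data is the same either way.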
cjk answered Sep 28 '22 11:09