Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Delete duplicate data and load into another table in SQL Server

I have a question about SQL Server.

Table: emp

empid   |  name |sal
1       |  abc  |100
2       |  def  |200
3       |  test |300
2       |  har  |500
3       |  jai  |600
4       | kali  |240

This table has duplicate data based on above table I want delete duplicate data from the emp table

And duplicate data should be loaded into empduplicate table.

Here empid is unique. If empid shows up multiple times, then that record is considered a duplicate.

empduplicate structure look like this:

Empid   |  name  | sal

Finally after deleting duplicate data, I want see data in emp table look like this:

empid  |  name  | sal 
1      |  abc   | 100
4      | kali   | 240

For deleting the duplicates, I tried this code:

;with duplicate as 
(
    select 
        *,
        row_number()over (partition by empid order by empid) as rn
    from emp
)
delete from duplicate 
where rn > 1

But I am not able to delete entire records.

Example: empid=2 has duplicate data

empid|name |sal
2    |def  |200
2    |har  |500

I need to delete entire empid=2 corresponding records. empid=2 has duplicate and need to delete it from the emp table.

And empduplicate table need to load duplicate data look like this:

empid    | name   |sal
2        |def     |200
2        |har     |500
3        |test    |300
3        |jai     |600

For inserting the duplicate data I tried this code:

insert into empduplicate 
    select 
        id, name, sal 
    from 
         emp  
    group by 
         id 
    having 
         count(*) > 1

That query throws an error:

Column 'duplicate.name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Please tell me how to write query to achieve my task in SQL Server

like image 484
ravi Avatar asked Oct 30 '22 12:10

ravi


1 Answers

You're almost there. Instead of using ROW_NUMBER, use COUNT:

WITH CteInsert AS(
    SELECT *,
        cnt = COUNT(empid) OVER(PARTITION BY empid)
    FROM emp
)
INSERT INTO empduplicate(empid, name, sal)
SELECT
    empid, name, sal
FROM CteInsert
WHERE cnt > 1;

WITH CteDelete AS(
    SELECT *,
        cnt = COUNT(empid) OVER(PARTITION BY empid)
    FROM emp
)
DELETE FROM CteDelete WHERE cnt > 1;

You need to execute the INSERT first before the DELETE. Additionally, you might want to enclose this in a single transaction.

like image 92
Felix Pamittan Avatar answered Nov 15 '22 05:11

Felix Pamittan