I made a mistake in a bulk insert script, so now i have "duplicate" rows with different colX. I need to delete this duplicate rows, but I cant figure out how. To be more precise, I have this:
col1 | col2 | col3 | colX
----+----------------------
0 | 1 | 2 | a
0 | 1 | 2 | b
0 | 1 | 2 | c
0 | 1 | 2 | a
3 | 4 | 5 | x
3 | 4 | 5 | y
3 | 4 | 5 | x
3 | 4 | 5 | z
and I want to keep the first occurrence of each (row, colX):
col1 | col2 | col3 | colX
----+----------------------
0 | 1 | 2 | a
3 | 4 | 5 | x
Thank you for your replies :)
To delete the duplicate rows from the table in SQL Server, you follow these steps: Find duplicate rows using GROUP BY clause or ROW_NUMBER() function. Use DELETE statement to remove the duplicate rows.
RANK function to SQL delete duplicate rows We can use the SQL RANK function to remove the duplicate rows as well. SQL RANK function gives unique row ID for each row irrespective of the duplicate row. In the following query, we use a RANK function with the PARTITION BY clause.
On the Design tab, click Run. Verify that the query returns the records that you want to delete. Click Design View and on the Design tab, click Delete. Access changes the select query to a delete query, hides the Show row in the lower section of the design grid, and adds the Delete row.
The go to solution for removing duplicate rows from your result sets is to include the distinct keyword in your select statement. It tells the query engine to remove duplicates to produce a result set in which every row is unique.
Try the simplest approach with Sql Server's CTE: http://www.sqlfiddle.com/#!3/2d386/2
Data:
CREATE TABLE tbl
([col1] int, [col2] int, [col3] int, [colX] varchar(1));
INSERT INTO tbl
([col1], [col2], [col3], [colX])
VALUES
(0, 1, 2, 'a'),
(0, 1, 2, 'b'),
(0, 1, 2, 'c'),
(0, 1, 2, 'a'),
(3, 4, 5, 'x'),
(3, 4, 5, 'y'),
(3, 4, 5, 'x'),
(3, 4, 5, 'z');
Solution:
select * from tbl;
with a as
(
select row_number() over(partition by col1 order by col2, col3, colX) as rn
from tbl
)
delete from a where rn > 1;
select * from tbl;
Output:
| COL1 | COL2 | COL3 | COLX |
-----------------------------
| 0 | 1 | 2 | a |
| 0 | 1 | 2 | b |
| 0 | 1 | 2 | c |
| 0 | 1 | 2 | a |
| 3 | 4 | 5 | x |
| 3 | 4 | 5 | y |
| 3 | 4 | 5 | x |
| 3 | 4 | 5 | z |
| COL1 | COL2 | COL3 | COLX |
-----------------------------
| 0 | 1 | 2 | a |
| 3 | 4 | 5 | x |
Or perhaps this: http://www.sqlfiddle.com/#!3/af826/1
Data:
CREATE TABLE tbl
([col1] int, [col2] int, [col3] int, [colX] varchar(1));
INSERT INTO tbl
([col1], [col2], [col3], [colX])
VALUES
(0, 1, 2, 'a'),
(0, 1, 2, 'b'),
(0, 1, 2, 'c'),
(0, 1, 2, 'a'),
(0, 1, 3, 'a'),
(3, 4, 5, 'x'),
(3, 4, 5, 'y'),
(3, 4, 5, 'x'),
(3, 4, 5, 'z');
Solution:
select * from tbl;
with a as
(
select row_number() over(partition by col1, col2, col3 order by colX) as rn
from tbl
)
delete from a where rn > 1;
select * from tbl;
Output:
| COL1 | COL2 | COL3 | COLX |
-----------------------------
| 0 | 1 | 2 | a |
| 0 | 1 | 2 | b |
| 0 | 1 | 2 | c |
| 0 | 1 | 2 | a |
| 0 | 1 | 3 | a |
| 3 | 4 | 5 | x |
| 3 | 4 | 5 | y |
| 3 | 4 | 5 | x |
| 3 | 4 | 5 | z |
| COL1 | COL2 | COL3 | COLX |
-----------------------------
| 0 | 1 | 2 | a |
| 0 | 1 | 3 | a |
| 3 | 4 | 5 | x |
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With