Delete "duplicate" rows in SQL Server 2010

I made a mistake in a bulk insert script, so now I have "duplicate" rows with different colX values. I need to delete these duplicate rows, but I can't figure out how. To be more precise, I have this:

 col1 | col2 | col3 | colX
------+------+------+------
    0 |    1 |    2 | a
    0 |    1 |    2 | b
    0 |    1 |    2 | c
    0 |    1 |    2 | a
    3 |    4 |    5 | x
    3 |    4 |    5 | y
    3 |    4 |    5 | x
    3 |    4 |    5 | z

and I want to keep only the first occurrence for each (col1, col2, col3) combination:

 col1 | col2 | col3 | colX
------+------+------+------
    0 |    1 |    2 | a
    3 |    4 |    5 | x

Thank you for your replies :)

asked Jun 21 '12 02:06

Fernando Serna

1 Answer

Try the simplest approach, a delete through a SQL Server CTE: http://www.sqlfiddle.com/#!3/2d386/2

Data:

CREATE TABLE tbl
    ([col1] int, [col2] int, [col3] int, [colX] varchar(1));

INSERT INTO tbl
    ([col1], [col2], [col3], [colX])
VALUES
    (0, 1, 2, 'a'),
    (0, 1, 2, 'b'),
    (0, 1, 2, 'c'),
    (0, 1, 2, 'a'),
    (3, 4, 5, 'x'),
    (3, 4, 5, 'y'),
    (3, 4, 5, 'x'),
    (3, 4, 5, 'z');

Solution:

select * from tbl;

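-- Number the rows within each col1 group; rn = 1 marks the row to keep.
with a as
(
  select row_number() over(partition by col1 order by col2, col3, colX) as rn
  from tbl
)
-- Deleting through the CTE removes the matching rows from tbl itself.
delete from a where rn > 1;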

select * from tbl;

Output:

| COL1 | COL2 | COL3 | COLX |
-----------------------------
|    0 |    1 |    2 |    a |
|    0 |    1 |    2 |    b |
|    0 |    1 |    2 |    c |
|    0 |    1 |    2 |    a |
|    3 |    4 |    5 |    x |
|    3 |    4 |    5 |    y |
|    3 |    4 |    5 |    x |
|    3 |    4 |    5 |    z |


| COL1 | COL2 | COL3 | COLX |
-----------------------------
|    0 |    1 |    2 |    a |
|    3 |    4 |    5 |    x |
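
Since the delete cannot be undone once it is committed, one cautious way to try it is inside a transaction. The following is only a sketch: it reuses the `tbl` table and the CTE from the solution above, and the final `ROLLBACK` is my assumption that this is a trial run, not part of the original answer.

```sql
-- Dry-run sketch: apply the delete inside a transaction, inspect, then undo.
BEGIN TRANSACTION;

WITH a AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2, col3, colX) AS rn
    FROM tbl
)
DELETE FROM a WHERE rn > 1;

-- Inspect what is left while the transaction is still open.
SELECT * FROM tbl;

-- Assumption: this is only a trial run, so undo the delete.
-- Swap in COMMIT TRANSACTION once the remaining rows look right.
ROLLBACK TRANSACTION;
```

After the rollback, `tbl` holds all of its original rows again, so the script can be rerun with a different `PARTITION BY` or `ORDER BY` until the result is what you want.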

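Or perhaps this, which treats rows as duplicates only when col1, col2 and col3 all match (note the extra (0, 1, 3, 'a') row in the data): http://www.sqlfiddle.com/#!3/af826/1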

Data:

CREATE TABLE tbl
    ([col1] int, [col2] int, [col3] int, [colX] varchar(1));

INSERT INTO tbl
    ([col1], [col2], [col3], [colX])
VALUES
    (0, 1, 2, 'a'),
    (0, 1, 2, 'b'),
    (0, 1, 2, 'c'),
    (0, 1, 2, 'a'),
    (0, 1, 3, 'a'),
    (3, 4, 5, 'x'),
    (3, 4, 5, 'y'),
    (3, 4, 5, 'x'),
    (3, 4, 5, 'z');

Solution:

select * from tbl;

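-- Number the rows within each (col1, col2, col3) group, ordered by colX.
with a as
(
    select row_number() over(partition by col1, col2, col3 order by colX) as rn
    from tbl
)
-- Keep rn = 1 in each group and delete the rest.
delete from a where rn > 1;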

select * from tbl;

Output:

| COL1 | COL2 | COL3 | COLX |
-----------------------------
|    0 |    1 |    2 |    a |
|    0 |    1 |    2 |    b |
|    0 |    1 |    2 |    c |
|    0 |    1 |    2 |    a |
|    0 |    1 |    3 |    a |
|    3 |    4 |    5 |    x |
|    3 |    4 |    5 |    y |
|    3 |    4 |    5 |    x |
|    3 |    4 |    5 |    z |

| COL1 | COL2 | COL3 | COLX |
-----------------------------
|    0 |    1 |    2 |    a |
|    0 |    1 |    3 |    a |
|    3 |    4 |    5 |    x |

answered Sep 20 '22 19:09

Michael Buen
