Rename duplicate rows

Here's a simplified example of my problem. I have a table where there's a "Name" column with duplicate entries:

ID    Name
---   ----
 1    AAA
 2    AAA
 3    AAA
 4    BBB
 5    CCC
 6    CCC
 7    DDD
 8    DDD
 9    DDD
10    DDD

Doing a GROUP BY like SELECT Name, COUNT(*) AS [Count] FROM [Table] GROUP BY Name results in this:

Name  Count
----  -----
AAA   3
BBB   1
CCC   2
DDD   4

I'm only concerned about the duplicates, so I'll add a HAVING clause, SELECT Name, COUNT(*) AS [Count] FROM [Table] GROUP BY Name HAVING COUNT(*) > 1:

Name  Count
----  -----
AAA   3
CCC   2
DDD   4

Trivial so far, but now things get tricky: I need a query to get me all the duplicate records, but with a nice incrementing indicator added to the Name column. The result should look something like this:

ID    Name
---   --------
 1    AAA
 2    AAA (2)
 3    AAA (3)
 5    CCC 
 6    CCC (2)
 7    DDD 
 8    DDD (2)
 9    DDD (3)
10    DDD (4)

Note row 4 with "BBB" is excluded, and the first duplicate keeps the original Name.

Using an EXISTS statement gives me all the records I need, but how do I go about creating the new Name value?

SELECT * FROM [Table] AS T1
WHERE EXISTS (
    SELECT Name, COUNT(*) AS [Count]
    FROM [Table]
    GROUP BY Name
    HAVING (COUNT(*) > 1) AND (Name = T1.Name))
ORDER BY Name

I need to create an UPDATE statement that will fix all the duplicates, i.e. change the Name as per this pattern.

Update: Figured it out now. It was the PARTITION BY clause I was missing.

asked Mar 03 '11 03:03

Jakob Gade

2 Answers

-- Number the rows within each Name, lowest Id first
With Dups As
    (
    Select Id, Name
        , Row_Number() Over ( Partition By Name Order By Id ) As Rnk
    From [Table]
    )
Select D.Id
    , D.Name + Case
                When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')'
                Else ''
                End As Name
From Dups As D
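
The query above returns every row, including names that occur only once (such as "BBB" in the question's sample). If those should be left out, as the question asks, one possible variation is to count the rows per Name in the same CTE and filter on that count. This is only a sketch: [Table], Id and Name follow the question's simplified schema, and Cnt is a column alias introduced here for illustration.

```sql
-- Sketch only: [Table], Id and Name are taken from the question's sample schema.
With Dups As
    (
    Select Id, Name
        , Row_Number() Over ( Partition By Name Order By Id ) As Rnk
        , Count(*) Over ( Partition By Name ) As Cnt   -- rows sharing this Name
    From [Table]
    )
Select D.Id
    , D.Name + Case
                When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')'
                Else ''
                End As Name
From Dups As D
Where D.Cnt > 1      -- drops names that occur only once
Order By D.Id
```

The first row of each group still has Rnk = 1, so it keeps its original Name.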

If you want an update statement you can use pretty much the same structure:

With Dups As
    (
    Select Id, Name
        , Row_Number() Over ( Partition By Name Order By Id ) As Rnk
    From [Table]
    )
Update T    -- target the alias declared in the From clause
Set Name = T.Name + Case
                    When D.Rnk > 1 Then ' (' + Cast(D.Rnk As varchar(10)) + ')'
                    Else ''
                    End
From [Table] As T
    Join Dups As D
        On D.Id = T.Id
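
To try the update without touching real data, something along these lines could be run against a scratch copy of the question's sample rows. Everything here is an assumption made for illustration: the temp table #Names, the varchar(50) width, and the extra Where D.Rnk > 1 filter, which skips rows that would otherwise be rewritten with an unchanged value. The inserts use Union All so the script also runs on SQL Server 2005.

```sql
-- Hypothetical scratch table mirroring the question's sample data.
Create Table #Names ( Id int Primary Key, Name varchar(50) Not Null );

Insert Into #Names ( Id, Name )
Select 1, 'AAA' Union All Select 2, 'AAA' Union All Select 3, 'AAA' Union All
Select 4, 'BBB' Union All
Select 5, 'CCC' Union All Select 6, 'CCC' Union All
Select 7, 'DDD' Union All Select 8, 'DDD' Union All
Select 9, 'DDD' Union All Select 10, 'DDD';

-- The statement before a CTE must be terminated with a semicolon.
With Dups As
    (
    Select Id
        , Row_Number() Over ( Partition By Name Order By Id ) As Rnk
    From #Names
    )
Update T
Set Name = T.Name + ' (' + Cast(D.Rnk As varchar(10)) + ')'
From #Names As T
    Join Dups As D
        On D.Id = T.Id
Where D.Rnk > 1;     -- only the second and later rows of each group change

Select Id, Name From #Names Order By Id;

Drop Table #Names;
```

On a real table the Name column needs room for the added suffix; if it is too narrow, the update may fail with a truncation error.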

answered Sep 28 '22 02:09

Thomas

Just update the subquery directly:

update d
set Name = Name + ' (' + cast(r as varchar(10)) + ')'
from    (   select  Name,
                    -- order by ID so the lowest ID in each group keeps the original Name
                    row_number() over (partition by Name order by ID) as r
            from    [table]
        ) d
where r > 1
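
Since this rewrites data in place, it may be worth previewing the result inside a transaction first. The following is a rough sketch, not part of the original answer: it assumes the question's ID and Name columns, orders by ID so the lowest ID in each group keeps its Name, and puts a space before the parenthesis to match the pattern requested in the question.

```sql
Begin Transaction;

Update d
Set Name = Name + ' (' + Cast(r As varchar(10)) + ')'
From    (   Select  Name,
                    Row_Number() Over (Partition By Name Order By ID) As r
            From    [table]
        ) d
Where r > 1;

-- Inspect the renamed rows while the transaction is still open.
Select ID, Name From [table] Order By ID;

-- Undo the change; replace with Commit Transaction once the output looks right.
Rollback Transaction;
```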

answered Sep 28 '22 01:09

nathan_jr


Tags: tsql, sql-server-2005, sql-update, common-table-expression
