Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SSMS T-SQL set column for duplicate row number

I have a database with approximately 10 million rows (and 20 columns - about 4 GB) where about 10% of the rows have a duplicate column. Database is in SQL Server 2014 Express and using SSMS.

I created a new column CNT (int, null) to count the occurrences of each row where I have a duplicate ID. Desired result would look like:

ID     CNT
100    1
100    2
101    1
102    1
102    2
103    1
104    1

Not being really familiar with advanced SQL capabilities I did some research and came up with using a CTE to set the CNT column. Worked fine on a small test table - but it was obvious this is not the way to go for a large table (I killed it after 5+ hours on a pretty decent system.)

Here's the code that I attempted to implement:

with CTE as
(select dbo.database.id, dbo.database.cnt,
 RN = row_number() over (partition by id order by id)
 from dbo.databasee)
update CTE set CNT = RN

Column ID is of type Int. All columns allow nulls - there are no keys or indexed columns.

like image 467
RPMcCormick Avatar asked Apr 11 '26 10:04

RPMcCormick


1 Answers

Edit: Martin is right, I can only offer an alternate solution than the CTE at the moment. Make a new table exactly like your old one, and insert the old table's data into it with this.

INSERT INTO newTable
SELECT ID, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID)
FROM oldTable;

Then you can delete your old table. Definitely not a perfect solution, but it should work.

like image 50
saarrrr Avatar answered Apr 13 '26 11:04

saarrrr



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!