Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Auto Increment in update statement resulting in duplicate values on large table

I had a need to renumber rows in a table using the following code from a You've Been Haacked blog post:

DECLARE @counter int
SET @counter = 0
UPDATE #myTable
SET @counter = ID = @counter + 1

This works fine with a few records to several thousand, but when I run it on a SQL Server 2012 (64-bit) instance on a table with 250K+ records, I end up with many duplicates (ID #1 appears 12 times). In all, there are about 27K records with duplicate values. The strange thing is that about 19K have exactly 12 dupes (the same number of processors on my machine).

What is the cause of the duplicates?

like image 647
C-Pound Guru Avatar asked May 14 '15 18:05

C-Pound Guru


1 Answers

This is not guaranteed to work which I'm sure Phil Haack calls out. Throw this code away.

The likely reason is parallelism. That variable might be hammered on multiple threads.

The strange thing is that about 19K have exactly 12 dupes (the same number of processors on my machine).

That fits the explanation.

Use ROW_NUMBER to generate good guaranteed to work IDs:

update t
set ID = r
from (
 select *, row_number() over (order by something) r from T
) t

Just as simple but sane.

like image 102
usr Avatar answered Sep 28 '22 07:09

usr