I've added a new column, packageNo, to my table:
create table Packages(
id varchar(20) primary key, -- ok, I know :)
orderNo uniqueIdentifier not null,
orderlineNo int not null,
packageNo int not null default(0)
)
Now I want to generate the packageNo with the following rules:
My problem is that the script I wrote uses 15 minutes for 26500 rows on my testServer. Here it is:
set NoCount ON
declare @Counter int
declare @handledCounter int
declare @currentorder uniqueIdentifier
declare @fetchedOrder uniqueIdentifier
declare @fetchedId varchar(20) -- will using PK speed up things?
declare PackageNo_Cursor cursor for
select orderNo, id from packages order by orderNo, orderlineNo for update of packageNo
open PackageNo_Cursor
fetch next from PackageNo_Cursor into @fetchedOrder, @fetchedId
set @currentOrder = @fetchedOrder
set @counter = 0
set @handledCounter = 0
while @@fetch_status = 0
begin
if (@currentOrder <> @fetchedOrder)
begin -- reset counter for each order
set @currentOrder = @fetchedOrder
set @counter = 0
end
set @counter = @counter + 1
set @handledCounter = @handledCounter +1
if (@handledCounter % 50 = 0)
begin
print 'handled = ' + cast(@handledCounter as varchar)
end
update packages set packageNo = @counter where current of PackageNo_Cursor
fetch next from PackageNo_Cursor into @fetchedOrder, @fetchedId
end
close PackageNo_Cursor
deallocate PackageNo_Cursor
This should result in:
id - orderno - lineNo - packageNo (what I need to set)
ean1 - guid1 - 1 - 1
ean2 - guid1 - 2 - 2
ean3 - guid2 - 1 - 1
ean15- guid2 - 3 - 2
ean15- guid2 - 4 - 3
Can I make this run any faster?
You don't want to use a cursor, you want to do this as a set, like so:
update p set
packageNo = r.packageNo
from
packages p
inner join
(select orderNo, orderLineNo,
ROW_NUMBER() OVER(PARTITION BY orderNo ORDER BY orderLineNo) as packageNo
from packages) r on
p.orderNo = r.orderNo
and p.orderLineNo = r.orderLineNo
This will leverage SQL Server's ROW_NUMBER function to get you the correct count by each line. Using UPDATE...FROM, we can create an inner join from which to update. This is vastly more efficient than the cursor.
See, cursors add an iterative ability to a procedural and set based language (SQL). The two don't play well together. That update statement is being called in order for each row (and opening/committing a transaction, to boot). If you do it all in one statement, SQL can parallelize this to make it much faster.
The standard rule is this: Use cursors as sparingly as possible. There are some fringe cases where they're necessary, but if you aren't doing massive amounts of SQL administration a day, it's doubtful you'll ever come across those cases.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With