I have an application connected to a SQL Server 2014 database that combines several rows into one. There are no other connections to this database while the application is running.
First, select a chunk of rows within a specific time span. This query uses a non-clustered seek (TIME column) merged with a clustered lookup.
select ...
from FOO
where TIME >= @from and TIME < @to and ...
Then, we process these rows in c# and write changes as a single update and multiple deletes, this happens many times per chunk. These also use non-clustered index seeks.
begin tran
update FOO set ...
where NON_CLUSTERED_ID = @id
delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)
commit
I am getting deadlocks when running this with multiple parallel chunks. I tried using ROWLOCK
for the update
and delete
but that caused even more deadlocks than before for some reason, even though there are no overlaps between chunks.
Then I tried TABLOCKX, HOLDLOCK
on the update
, but that means I can't perform my select
in parallel so I'm losing the advantages of parallelism.
Any idea how I can avoid deadlocks but still process multiple parallel chunks?
Would it be safe to use NOLOCK
on my select
in this case, given there is no row overlap between chunks? Then TABLOCKX, HOLDLOCK
would only block the update
and delete
, correct?
Or should I just accept that deadlocks will happen and retry the query in my application?
UPDATE (additional information): All deadlocks so far have happened in the update
and delete
phase, none in the select
. I'll try to get some deadlock logs up if I can't get this solved today (the correct trace flags weren't enabled before).
UPDATE: These are the two arrangements of deadlocks that occur with ROWLOCK
, they both refer only to the delete
statement and the non-clustered index it uses. I'm not sure if these are the same as the deadlocks that occur without any table hints as I wasn't able to reproduce any of those.
Ask if there's anything else needed from the .xdl, I'm a bit weary of attaching the whole thing.
A SQL Server deadlock occurs when exclusive locks are held on resources required by multiple processes and those processes cannot continue to completion.
Update lock (U) is used to avoid deadlocks. Unlike the Exclusive lock, the Update lock places a Shared lock on a resource that already has another shared lock on it.
Deadlock frequency can sometimes be reduced by ensuring that all applications access their common data in the same order - meaning, for example, that they access (and therefore lock) rows in Table A, followed by Table B, followed by Table C, and so on.
The general advice regarding deadlocks: make sure you do everything in the same order, i.e. acquire locks in the same order, for different processes.
You can find the same advice in this technical article on microsoft.com regarding Minimizing Deadlocks. There's a good reason it is listed first.
- Access objects in the same order.
- Avoid user interaction in transactions.
- Keep transactions short and in one batch.
- Use a lower isolation level.
- Use a row versioning-based isolation level.
- Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.
- Use snapshot isolation.
- Use bound connections.
Update after question from Cato:
How would acquiring locks in the same order apply here? Have you got any advice on how he would change his SQL to do that?
Deadlocks are always the same, no matter what environment: two processes (say A
& B
) acquire multiple locks (say X
& Y
) in a different order so that A
is waiting for Y
and B
is waiting for X
while A
is holding X
and B
is holding Y
.
It applies here because DELETE
and UPDATE
statements implicitely acquire locks on the rows or index range or table (depending on what the engine deems appropriate).
You should analyze your process and see if there are scenarios where locks could be acquired in a different order. If that doesn't reveal anything, you can analyze deadlocks using the SQL Server Profiler:
To trace deadlock events, add the Deadlock graph event class to a trace. This event class populates the TextData data column in the trace with XML data about the process and objects that are involved in the deadlock. SQL Server Profiler can extract the XML document to a deadlock XML (.xdl) file which you can view later in SQL Server Management Studio. You can configure SQL Server Profiler to extract Deadlock graph events to a single file that contains all Deadlock graph events, or to separate files.
I'd use sp_getapplock
in the updating transaction to prevent multiple instances of this code running in parallel. This will not block the selecting statement as table locking hints do.
You still should program the retrying logic, because it may take a while to acquire the lock, longer than the timeout parameter.
This is how the updating transaction can be wrapped into sp_getapplock
.
BEGIN TRANSACTION;
BEGIN TRY
DECLARE @VarLockResult int;
EXEC @VarLockResult = sp_getapplock
@Resource = 'some_unique_name_app_lock',
@LockMode = 'Exclusive',
@LockOwner = 'Transaction',
@LockTimeout = 60000,
@DbPrincipal = 'public';
IF @VarLockResult >= 0
BEGIN
-- Acquired the lock
update FOO set ...
where NON_CLUSTERED_ID = @id
delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)
END ELSE BEGIN
-- return some error code, so that the caller could retry
END;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
-- handle the error
END CATCH;
The selecting statement doesn't need any changes.
I would recommend against NOLOCK
, even though you say that IDs in chunks do not overlap. With this hint the SELECT query can skip some pages that are being changed, it can read some pages twice. It is unlikely that such behavior can be tolerated.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With