Recently I have come across a deadlock scenario on OLTP box (Sql server 2005) of a client and found that it's caused by two stored procedures being called by 2 different threads.
1, Insert sp that inserts data in X table.
Insert Into X (col1 , col2 , col3 )
Values ('value 1' , 'value 2' , 'value 3' )
2, Delete sp that deletes data from X table.
DELETE X
FROM X T1 WITH (NOLOCK)
INNER JOIN Y T2 WITH (NOLOCK)
ON T1.[col2] = T2.[col2]
WHERE t2.date < 'date time value'
X table has one unique, clustered primary key and two non clustered, non unique indexes. I have analysed the deadlock by setting t1222 tace flag on and the output is summarized below;
Insert sp acquired a IX lock on Non clustered index for Column 1. Delete sp is waiting for X lock on the same Non clustered index for column 1 during this time.
Delete sp acquired a U lock on Non clustered index for Column 2. Insert sp is waiting for a IX lock on the same Non clustered index for Column 2 during this time.
Any idea or suggestion to avoid deadlock would be really appreciated.
EDIT
output of trace flag t1222
deadlock-list
deadlock victim=process3c77d68
process-list
process id=process3c12c58 taskpriority=0 logused=1044 waitresource=PAGE: 17:8:7726 waittime=1250 ownerId=5169682909 transactionname=user_transaction lasttranstarted=2011-02-03T03:34:03.443 XDES=0xfe64d78b0 lockMode=IX schedulerid=2 kpid=9544 status=suspended spid=219 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2011-02-03T03:34:03.457 lastbatchcompleted=2011-02-03T03:34:03.453 clientapp=.Net SqlClient Data Provider hostname=HQMTSRV026 hostpid=3308 loginname=EASUser isolationlevel=read committed (2) xactid=5169682909 currentdb=17 lockTimeout=4294967295 clientoption1=671088672 clientoption2=128056
executionStack
frame procname=adhoc line=1 stmtend=296 sqlhandle=0x0200000084ce2a1d0e95a5623fa3a9c0981d422e33cab999
(@1 int<c/>@2 varchar(8000)<c/>@3 nvarchar(4000))INSERT INTO [VB_Audit_TransactionDetail]([ItemID]<c/>[TransactionID]<c/>[ItemValue]) values(@1<c/>@2<c/>@3)
frame procname=adhoc line=1 stmtend=296 sqlhandle=0x02000000afcb1733f435fb93e13556600acf32bb32e10020
Insert Into VB_Audit_TransactionDetail (ItemID <c/> TransactionID <c/> ItemValue ) Values (4 <c/> '0255978c-f56e-477e-b361-8abe62433cff' <c/> N'HQOLB006' )
frame procname=EAS.dbo.SP_Insert line=13 stmtstart=482 stmtend=522 sqlhandle=0x03001100805efa5997d69400719600000100000000000000
exec (@CommandText)
inputbuf
Proc [Database Id = 17 Object Id = 1509580416]
process id=process3c77d68 taskpriority=0 logused=364 waitresource=PAGE: 17:6:334008 waittime=1234 ownerId=5169682116 transactionname=user_transaction lasttranstarted=2011-02-03T03:34:03.053 XDES=0xa8e297cd0 lockMode=X schedulerid=12 kpid=10300 status=suspended spid=327 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2011-02-03T03:33:41.137 lastbatchcompleted=2011-02-03T03:33:41.133 clientapp=Microsoft SQL Server hostname=HQSSISSRV002 hostpid=7632 loginname=NBKDOM\SQLCSRVC isolationlevel=read committed (2) xactid=5169682116 currentdb=17 lockTimeout=4294967295 clientoption1=671350816 clientoption2=128056
executionStack
frame procname=EAS.dbo.PurgeAuditTransactionTables line=59 stmtstart=4202 stmtend=4728 sqlhandle=0x030011006354a2313d11ae00979a00000100000000000000
DELETE [dbo].[Audit_TransactionDetail]
FROM [dbo].[Audit_TransactionDetail] T1 WITH (NOLOCK)
INNER JOIN [dbo].[Audit_NBKTransaction] T2 WITH (NOLOCK)ON T1.[TransactionID] = T2.[TransactionID]
WHERE TransactionPostedDateTime < @LastReplicationDateTime
frame procname=adhoc line=1 sqlhandle=0x0100110096968c0560c430ff190000000000000000000000
EXEC PurgeAuditTransactionTables '02 Feb 2011 19:00:13:870'
inputbuf
EXEC PurgeAuditTransactionTables '02 Feb 2011 19:00:13:870'
resource-list
pagelock fileid=8 pageid=7726 dbid=17 objectname=EAS.dbo.Audit_TransactionDetail id=lock4f79500 mode=U associatedObjectId=886415243542528
owner-list
owner id=process3c77d68 mode=U
waiter-list
waiter id=process3c12c58 mode=IX requestType=wait
pagelock fileid=6 pageid=334008 dbid=17 objectname=EAS.dbo.Audit_TransactionDetail id=lock846afca00 mode=IX associatedObjectId=604940266831872
owner-list
owner id=process3c12c58 mode=IX
waiter-list
waiter id=process3c77d68 mode=X requestType=wait
One more important thing; the delete and insert statements always touches 2 different sets of data.
1, Insert sp that inserts data in X table. 2, Delete sp that deletes data from X table. X table has one unique, clustered primary key and two non clustered, non unique indexes.
Useful ways to avoid and minimize SQL Server deadlocksTry to keep transactions short; this will avoid holding locks in a transaction for a long period of time. Access objects in a similar logical manner in multiple transactions. Create a covering index to reduce the possibility of a deadlock.
Rather than post a description of your understanding of the deadlock graph, post the deadlock graph itself. the XML, not a bitmap of the graphical rendering. At a first glance, the fact that there is a conflict on an IX lock involved suggest lock escalation is occurring, which indicates there is no index to service the DELETE, or that the index tipping point on the join is hit. But then again, this is just speculation due to insufficient information. To give any meaningful answer, one would need the actual deadlock XML and the exact schema definition of the object involved.
After UPDATE
You still didn't provide the information requested: the exact schema of all objects involved, including the clustered and all non-clustered index definitions. Until then, the initial suspicion holds: the DELETE is doing a table scan and is has escalated to page lock granularity. This is due to bad indexing.
Your evaluation that "the delete and insert statements always touches 2 different sets of data" is wrong on two accounts:
As a side note, audit tables are almost always required to be clustered by incident date/time because all queries on them requests specific time intervals ('what happened between ... and ...') and item seeks can be satisfied by a non-clustered primary key in ID. Purging audit records, even when properly clustered, is riddled with performance problems and requires batching to avoid log explosion. The best solution is to deploy an automated sliding window using partitioning, but that comes with its own challenges.
I'd guess that table T2 has to be pretty big. Is column t2.date indexed? If not, then a table scan on a big table could cause your problems. Indexing that column could optimize the delete by avoiding the table scan. Alternatively, if the indexes on col1 or col2 are not really being used (or used enough), dropping them might also avoid the problem.
How often do these deadlocks occur? If they are very infrequent, a kludgy work-around might suffice: Wrap each statement in a try/catch block, in the catch check if the error was due to deadlock, and if so retry the command. You can also carefully use SET DEADLOCK_PRIORITY
to pick which query will always win/lose (but you have to balance this with all calls to the table).
Oh, and drop those WITH (NOLOCK)
s. NOLOCK is ignored by inserts, updates, and deletes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With