The Sql Server 2008 R2 instance in question is a heavy load OLTP production server. The deadlock issue came up a few days back and is still unresolved. We received the Xml deadlock report that listed the stored procedures involved in the deadlock and some other details. I'll try to list down the facts from this xml first:
Two stored procedures are involved in the deadlock, say SP1 and SP2. According to the report SP1 was running in Isolation level "Serializable" and SP2 was running in "ReadCommitted".
We have investigated the following:
Are we setting IsolationLevel of SP1 to "Serializable" inside SP or in Code? - No.
Is any other SP whose IsolationLevel is "Serializable" calling SP1? - No.
Are the table used by SP1 called by
any other SP that has Isolation
Level as "Serializable"? - Yes.
There are SPs that have Isolation
Level set to "Serializable" and
access the same tables as SP1,
but we don't know whether they
were running at the time of
deadlock or not as the deadlock
report only showed SP1 and SP2.
Lines of thought:
We have considered the following possible causes:
Deadlock is occurring because SP1 is running as "Serializable". - Why is this SP running in Serializable when I haven't set it? Is the Isolation level escalating (like locks do)? If we figure this out and make it run as ReadCommitted, will the issue be resolved?
Any other SP is running, locking the table used by SP1 and causes a deadlock between SP1 and SP2. - Wouldn't this SP be listed in the deadlock report? Can the deadlock report miss such a dependency? If yes then we might only be getting partial information. This still doesn't resolve how SP1 is running in Serializable, though.
Suggestions:
If this information is not sufficient in resolving the problem, how can I obtain more information from SQL Server for my purpose and what information should I try to collect?
Any other Line of Thought that you'd pursue in solving this issue?
Update:
This is the trace log information for the deadlock. I've changed the names of SPs etc. but have checked and verified that the changes don't miss out any relevant information. Check the notes succeeding the code for more info on tables etc.
?<EVENT_INSTANCE>
<EventType>DEADLOCK_GRAPH</EventType>
<PostTime>2010-09-07T11:27:47.870</PostTime>
<SPID>16</SPID>
<TextData>
<deadlock-list>
<deadlock victim="process5827708">
<process-list>
<process id="process5827708" taskpriority="0" logused="0" waitresource="KEY: 7:72057594228441088 (8d008a861f4f)"
waittime="5190" ownerId="1661518243" transactionname="SELECT" lasttranstarted="2010-09-07T11:27:42.657"
XDES="0x80bf3b50" lockMode="RangeS-S" schedulerid="4" kpid="2228" status="suspended" spid="76" sbid="0"
ecid="0" priority="0" trancount="0" lastbatchstarted="2010-09-07T11:27:42.657"
lastbatchcompleted="2010-09-07T11:27:42.657" clientapp=".Net SqlClient Data Provider"
hostname="xxx" hostpid="5988" loginname="xxx" isolationlevel="serializable (4)"
xactid="1661518243" currentdb="7" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">
<executionStack>
<frame procname="SP1" line="12" stmtstart="450" stmtend="6536"
sqlhandle="0x0300070090cbdc7742720c00e99d00000100000000000000">
Select ... from Table1, Table2, Table4, Table5
</frame>
</executionStack>
<inputbuf>
Proc [Database Id = 7 Object Id = 2010958736]
</inputbuf>
</process>
<process id="process5844bc8" taskpriority="0" logused="1873648" waitresource="KEY: 7:72057594228441088 (0e00ce038ed0)"
waittime="4514" ownerId="1661509575" transactionname="user_transaction" lasttranstarted="2010-09-07T11:27:40.423"
XDES="0x37979ae90" lockMode="X" schedulerid="7" kpid="3260" status="suspended" spid="104" sbid="0" ecid="0"
priority="0" trancount="2" lastbatchstarted="2010-09-07T11:27:43.350" lastbatchcompleted="2010-09-07T11:27:43.350"
clientapp=".Net SqlClient Data Provider" hostname="xxx" hostpid="5988" loginname="xxx"
isolationlevel="read committed (2)" xactid="1661509575" currentdb="7" lockTimeout="4294967295"
clientoption1="673185824" clientoption2="128056">
<executionStack>
<frame procname="SP2" line="68" stmtstart="5272" stmtend="5598"
sqlhandle="0x030007003432350f109a0c00e99d00000100000000000000">
UPDATE Table1 ...
</frame>
</executionStack>
<inputbuf>
Proc [Database Id = 7 Object Id = 255144500]
</inputbuf>
</process>
</process-list>
<resource-list>
<keylock hobtid="72057594228441088" dbid="7" objectname="Table1" indexname="Index1"
id="lock448e2c580" mode="X" associatedObjectId="72057594228441088">
<owner-list>
<owner id="process5844bc8" mode="X" />
</owner-list>
<waiter-list>
<waiter id="process5827708" mode="RangeS-S" requestType="wait" />
</waiter-list>
</keylock>
<keylock hobtid="72057594228441088" dbid="7" objectname="Table1" indexname="Index1"
id="lock2ba335880" mode="RangeS-S" associatedObjectId="72057594228441088">
<owner-list>
<owner id="process5827708" mode="RangeS-S" />
</owner-list>
<waiter-list>
<waiter id="process5844bc8" mode="X" requestType="wait" />
</waiter-list>
</keylock>
</resource-list>
</deadlock>
</deadlock-list>
</TextData>
<TransactionID />
<LoginName>xx</LoginName>
<StartTime>2010-09-07T11:27:47.867</StartTime>
<ServerName>xxx</ServerName>
<LoginSid>xxx</LoginSid>
<EventSequence>116538375</EventSequence>
<IsSystem>1</IsSystem>
<SessionLoginName />
</EVENT_INSTANCE>
SP1 is performing a select that takes data from 5 different tables (Table1 to Table5) (uses inner query etc.) SP2 performs an update on Table1.
An interesting thing is one of the columns that SP2 updates is a foreign key field in Table1 and primary key of Table2 while both Table1 and Table2 are part of the select statement of SP1, not sure this is relevant but didn't want to miss out anything.
NOTE: indexname="Index1" (in deadlock graph above) -- Index1 is on the same column that is foreign key in Table1 and primary key of Table2.
Check this MSDN article which states:
An isolation level has connection-wide scope, and once set for a connection with the SET TRANSACTION ISOLATION LEVEL statement, it remains in effect until the connection is closed or another isolation level is set. When a connection is closed and returned to the pool, the isolation level from the last SET TRANSACTION ISOLATION LEVEL statement is retained. Subsequent connections reusing a pooled connection use the isolation level that was in effect at the time the connection is pooled.
The issue was that a connection gets opened with Serializable isolation level; the associated transaction was disposed and so was the connection but the connection was not destroyed and went to the connection pool. Next time when a request for a connection was made (with same connection string) this very connection was getting returned and since the query did not specify any isolation level, it was executing in the Serializable isolation level.
Basically, if you have a connection pool and open a connection in a particular isolation level, let's say Serializable, then the connection will go back to the pool with the isolation level set to Serializable. Next time you request a connection, you can't be sure that this connection will not be returned so even through the default isolation level is ReadCommitted you may get one of these "Serializable" connections.
Another caveat is that each time you set the isolation level to Serializable (or anything else for that matter) you may be picking different connections and slowly you may pollute more and more connections in the connection pool by setting their isolation level to Serializable (or whatever you set).
I didn't find any mechanism to reset the disposing connection (when it was going back to the connection pool after executing my query). One workaround is to explicitly reset the isolation level for each connection. But this is tedious.
So the best alternative is to create separate connection pools for different isolation levels.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With