Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deadlock in SQL Server 2005! Two real-time bulk upserts are fighting. WHY?

alt text

Here's the scenario:

I've got a table called MarketDataCurrent (MDC) that has live updating stock prices.

I've got one process called 'LiveFeed' which reads prices streaming from the wire, queues up inserts, and uses a 'bulk upload to temp table then insert/update to MDC table.' (BulkUpsert)

I've got another process which then reads this data, computes other data, and then saves the results back into the same table, using a similar BulkUpsert stored proc.

Thirdly, there are a multitude of users running a C# Gui polling the MDC table and reading updates from it.

Now, during the day when the data is changing rapidly, things run pretty smoothly, but then, after market hours, we've recently started seeing an increasing number of Deadlock exceptions coming out of the database, nowadays we see 10-20 a day. The imporant thing to note here is that these happen when the values are NOT changing.

Here's all the relevant info:

Table Def:

CREATE TABLE [dbo].[MarketDataCurrent](
 [MDID] [int] NOT NULL,
 [LastUpdate] [datetime] NOT NULL,
 [Value] [float] NOT NULL,
 [Source] [varchar](20) NULL, 
CONSTRAINT [PK_MarketDataCurrent] PRIMARY KEY CLUSTERED 
(
 [MDID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

-

alt text

I've got a Sql Profiler Trace Running, catching the deadlocks, and here's what all the graphs look like.

alt text

Process 258 is called the following 'BulkUpsert' stored proc, repeatedly, while 73 is calling the next one:

ALTER proc [dbo].[MarketDataCurrent_BulkUpload]
 @updateTime datetime,
 @source varchar(10)
as

begin transaction

update c with (rowlock) set LastUpdate = getdate(), Value = t.Value, Source = @source 
from MarketDataCurrent c INNER JOIN #MDTUP t ON c.MDID = t.mdid
where c.lastUpdate < @updateTime
and   c.mdid not in (select mdid from MarketData where LiveFeedTicker is not null     and     PriceSource like 'LiveFeed.%')
and   c.value <> t.value

insert  into MarketDataCurrent
with (rowlock)
select  MDID, getdate(), Value, @source from #MDTUP 
where mdid not in (select mdid from MarketDataCurrent with (nolock))
and  mdid not in (select mdid from MarketData where LiveFeedTicker is not null     and PriceSource like 'LiveFeed.%')

commit

And the other one:

ALTER PROCEDURE [dbo].[MarketDataCurrent_LiveFeedUpload] 
AS
begin transaction

 -- Update existing mdid
 UPDATE c WITH (ROWLOCK) SET LastUpdate = t.LastUpdate, Value = t.Value, Source = t.Source 
 FROM MarketDataCurrent c INNER JOIN #TEMPTABLE2 t ON c.MDID = t.mdid;

 -- Insert new MDID
 INSERT INTO MarketDataCurrent with (ROWLOCK) SELECT * FROM #TEMPTABLE2 
 WHERE MDID NOT IN (SELECT MDID FROM MarketDataCurrent with (NOLOCK))

 -- Clean up the temp table
 DELETE #TEMPTABLE2

commit

To clarify, those Temp Tables are being created by the C# code on the same connection and are populated using the C# SqlBulkCopy class.

To me it looks like it's deadlocking on the PK of the table, so I tried removing that PK and switching to a Unique Constraint instead but that increased the number of deadlocks 10-fold.

I'm totally lost as to what to do about this situation and am open to just about any suggestion.

HELP!!


In response to the request for the XDL, here it is:

<deadlock-list>
 <deadlock victim="processc19978">
  <process-list>
   <process id="processaf0b68" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (d900ed5a6cc6)" waittime="718" ownerId="1102128174" transactionname="user_transaction" lasttranstarted="2010-06-11T16:30:44.750" XDES="0xffffffff817f9a40" lockMode="U" schedulerid="3" kpid="8228" status="suspended" spid="73" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-11T16:30:44.750" lastbatchcompleted="2010-06-11T16:30:44.750" clientapp=".Net SqlClient Data Provider" hostname="RISKAPPS_VM" hostpid="3836" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1102128174" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
    <executionStack>
     <frame procname="MKP_RISKDB.dbo.MarketDataCurrent_BulkUpload" line="28" stmtstart="1062" stmtend="1720" sqlhandle="0x03000600a28e5e4ef4fd8e00849d00000100000000000000">
UPDATE c WITH (ROWLOCK) SET LastUpdate = getdate(), Value = t.Value, Source = @source 
FROM MarketDataCurrent c INNER JOIN #MDTUP t ON c.MDID = t.mdid
WHERE c.lastUpdate &lt; @updateTime
and   c.mdid not in (select mdid from MarketData where BloombergTicker is not null and PriceSource like &apos;Blbg.%&apos;)
and   c.value &lt;&gt; t.value     </frame>
     <frame procname="adhoc" line="1" stmtstart="88" sqlhandle="0x01000600c1653d0598706ca7000000000000000000000000">
exec MarketDataCurrent_BulkUpload @clearBefore, @source     </frame>
     <frame procname="unknown" line="1" sqlhandle="0x000000000000000000000000000000000000000000000000">
unknown     </frame>
    </executionStack>
    <inputbuf>
(@clearBefore datetime,@source nvarchar(10))exec MarketDataCurrent_BulkUpload @clearBefore, @source    </inputbuf>
   </process>
   <process id="processc19978" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (74008e31572b)" waittime="718" ownerId="1102128228" transactionname="user_transaction" lasttranstarted="2010-06-11T16:30:44.780" XDES="0x380be9d8" lockMode="U" schedulerid="5" kpid="8464" status="suspended" spid="248" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-11T16:30:44.780" lastbatchcompleted="2010-06-11T16:30:44.780" clientapp=".Net SqlClient Data Provider" hostname="RISKBBG_VM" hostpid="4480" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1102128228" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
    <executionStack>
     <frame procname="MKP_RISKDB.dbo.MarketDataCurrentBlbgRtUpload" line="14" stmtstart="840" stmtend="1220" sqlhandle="0x03000600005f9d24c8878f00849d00000100000000000000">
UPDATE c WITH (ROWLOCK) SET LastUpdate = t.LastUpdate, Value = t.Value, Source = t.Source 
        FROM MarketDataCurrent c INNER JOIN #TEMPTABLE2 t ON c.MDID = t.mdid;

        -- Insert new MDID     </frame>
     <frame procname="adhoc" line="1" sqlhandle="0x010006004a58132228bf8d73000000000000000000000000">
MarketDataCurrentBlbgRtUpload     </frame>
    </executionStack>
    <inputbuf>
MarketDataCurrentBlbgRtUpload    </inputbuf>
   </process>
  </process-list>
  <resource-list>
   <keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock5ba77b00" mode="U" associatedObjectId="72057594090487808">
    <owner-list>
     <owner id="processc19978" mode="U"/>
    </owner-list>
    <waiter-list>
     <waiter id="processaf0b68" mode="U" requestType="wait"/>
    </waiter-list>
   </keylock>
   <keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock65dca340" mode="U" associatedObjectId="72057594090487808">
    <owner-list>
     <owner id="processaf0b68" mode="U"/>
    </owner-list>
    <waiter-list>
     <waiter id="processc19978" mode="U" requestType="wait"/>
    </waiter-list>
   </keylock>
  </resource-list>
 </deadlock>
</deadlock-list>
like image 689
skimania Avatar asked Jun 11 '10 13:06

skimania


3 Answers

The deadlock seems to be straight forward deadlocks on key access order. One trivial explanation is overlap of the updated keys between the two bulk update operations.

A less trivial explanation though is that in SQL Server (and other servers too) the locked keys are hashed, and there is a (pretty significant) hash collision probability. This would explain why you see more deadlocks lately compared with before: simply your volume of data has increased and therefore the collision probability has increased. If this seems esoteric and improbable, just read on at %%lockres%% collision probability magic marker: 16,777,215, and the article linked from that. The probability is surprisingly high, for a perfect key distribution you have 50% collision probability after only ~16M inserts. For normal, real world, key distributions you have significant collision probability at only few thousand inserts. Unfortunately, there is no work around. Your only solution, if this is truly the problem, is to reduce the size of the batches (the size of the #temp tables) so that the collision probability is reduced. Or deal with deadlocks and retry... which you'll have to do anyway, but at least you can deal with fewer deadlocks.

like image 157
Remus Rusanu Avatar answered Oct 10 '22 03:10

Remus Rusanu


It's happening after main business hours, the data isn't changing and it just started recently. Did anything recently change on the server? I would suspect some new database maintenance job might be interfering.

BTW if you know the market is closed and the data isn't changing, why is your process still running?

like image 39
HLGEM Avatar answered Oct 10 '22 02:10

HLGEM


I'd like to answer one question I asked in a comment, which is,

"How do you identify the rows which are locking?".

In the following deadlock XDL, on the two "process" nodes which are locking, there is a waitresource attribute. in this case:

waitresource="KEY: 6:72057594090487808 (d4005c04b35f)

and

waitresource="KEY: 6:72057594090487808 (b00072ea4ffd)

Using the %%lockres%% keyword Remus pointed to,

select %%lockres%%, * from MarketDataCurrent 
   where %%lockres%% in ('(d4005c04b35f)', '(b00072ea4ffd)')

This yielded the two rows which are conflicting. they are indeed unique ids, and there is no collision. I still don't know why I'm getting a deadlock here, but I'm getting closer.

I will note that both of the id's are supposed to only be coming from the LiveFeed program, but again, there is a clause in the update that is supposed to be filtering out this row from actually updating from the other side.

<deadlock-list>
 <deadlock victim="processffffffff8f5872e8">
  <process-list>
   <process id="process8dcb68" taskpriority="0" logused="1256" waitresource="KEY: 6:72057594090487808 (d4005c04b35f)" waittime="1906" ownerId="1349627324" transactionname="user_transaction" lasttranstarted="2010-06-16T16:50:04.727" XDES="0x424e6258" lockMode="U" schedulerid="2" kpid="1004" status="suspended" spid="683" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-16T16:50:04.727" lastbatchcompleted="2010-06-16T16:50:04.727" clientapp=".Net SqlClient Data Provider" hostname="RISKAPPS_VM" hostpid="2600" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1349627324" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
    <executionStack>
     <frame procname="MKP_RISKDB.dbo.MarketDataCurrent_BulkUpload" line="28" stmtstart="1062" stmtend="1720" sqlhandle="0x03000600a28e5e4ef4fd8e00849d00000100000000000000">
        UPDATE c WITH (ROWLOCK) SET LastUpdate = getdate(), Value = t.Value, Source = @source 
        FROM MarketDataCurrent c INNER JOIN #MDTUP t ON c.MDID = t.mdid
        WHERE c.lastUpdate &lt; @updateTime
        and   c.mdid not in (select mdid from MarketData where BloombergTicker is not null and PriceSource like &apos;Blbg.%&apos;)
        and   c.value &lt;&gt; t.value     </frame>
             <frame procname="adhoc" line="1" stmtstart="88" sqlhandle="0x01000600c1653d0598706ca7000000000000000000000000">
        exec MarketDataCurrent_BulkUpload @clearBefore, @source     </frame>
     <frame procname="unknown" line="1" sqlhandle="0x000000000000000000000000000000000000000000000000">unknown</frame>
    </executionStack>
    <inputbuf>(@clearBefore datetime,@source nvarchar(10))exec MarketDataCurrent_BulkUpload @clearBefore, @source</inputbuf>
   </process>
   <process id="processffffffff8f5872e8" taskpriority="0" logused="0" waitresource="KEY: 6:72057594090487808 (b00072ea4ffd)" waittime="1921" ownerId="1349627388" transactionname="user_transaction" lasttranstarted="2010-06-16T16:50:04.757" XDES="0x289ea040" lockMode="U" schedulerid="5" kpid="11192" status="suspended" spid="382" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2010-06-16T16:50:04.757" lastbatchcompleted="2010-06-16T16:50:04.757" clientapp=".Net SqlClient Data Provider" hostname="RISKBBG_VM" hostpid="2452" loginname="RiskOpt" isolationlevel="read committed (2)" xactid="1349627388" currentdb="6" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
    <executionStack>
     <frame procname="MKP_RISKDB.dbo.MarketDataCurrentBlbgRtUpload" line="14" stmtstart="840" stmtend="1220" sqlhandle="0x03000600005f9d24c8878f00849d00000100000000000000">
        UPDATE c WITH (ROWLOCK) SET LastUpdate = t.LastUpdate, Value = t.Value, Source = t.Source 
        FROM MarketDataCurrent c INNER JOIN #TEMPTABLE2 t ON c.MDID = t.mdid;
    </frame>
     <frame procname="adhoc" line="1" sqlhandle="0x010006004a58132228bf8d73000000000000000000000000">
        MarketDataCurrentBlbgRtUpload     </frame>
    </executionStack>
    <inputbuf>
        MarketDataCurrentBlbgRtUpload    </inputbuf>
   </process>
  </process-list>
  <resource-list>
   <keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock409d32c0" mode="U" associatedObjectId="72057594090487808">
    <owner-list>
     <owner id="processffffffff8f5872e8" mode="U"/>
    </owner-list>
    <waiter-list>
     <waiter id="process8dcb68" mode="U" requestType="wait"/>
    </waiter-list>
   </keylock>
   <keylock hobtid="72057594090487808" dbid="6" objectname="MKP_RISKDB.dbo.MarketDataCurrent" indexname="PK_MarketDataCurrent" id="lock706647c0" mode="U" associatedObjectId="72057594090487808">
    <owner-list>
     <owner id="process8dcb68" mode="U"/>
    </owner-list>
    <waiter-list>
     <waiter id="processffffffff8f5872e8" mode="U" requestType="wait"/>
    </waiter-list>
   </keylock>
  </resource-list>
 </deadlock>
</deadlock-list>
like image 1
skimania Avatar answered Oct 10 '22 04:10

skimania