Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to determine why a distributed transaction is timing out

I am using LINQ to SQL and a third party SDK that supports distributed transactions. When I realize that a pending update will be updating both SQL records and records in the third party SDK, I am creating a TransactionScope with a 0 (presumably infinite) timeout (although I've also tried 12 hours as a timespan parameter). Then I use GetDtcTransaction on the ambient transaction (created by transactionscope) to get a DTC transaction to link to the third party SDK. Things work nicely for about 10 minutes, but after 10 minutes, the transaction disappears and an error occurs. How do I determine why the transaction is disappearing. I suspect it's a timeout because it regularly occurs after 10 minutes even though slightly varying degrees of work have been done at that point. But I'm at a loss about how to determine what terminated the transaction, why, and how to extend its life.

I've tried tracingthe following events with SQL profiler:

  1. All error and warning events
  2. All Security events except "Audit Schema Object" events
  3. All Transaction events except SQLTransaction and TransactionLog events

All I get around the time of the error are these events:

<Event id="19" name="DTCTransaction">
  <Column id="3" name="DatabaseID">1</Column>
  <Column id="11" name="LoginName">sa</Column>
  <Column id="35" name="DatabaseName">master</Column>
  <Column id="51" name="EventSequence">167065</Column>
  <Column id="12" name="SPID">10</Column>
  <Column id="60" name="IsSystem">1</Column>
  <Column id="1" name="TextData">{D662BBC4-21EC-436D-991C-DCB061A34782}</Column>
  <Column id="21" name="EventSubClass">16</Column>
  <Column id="25" name="IntegerData">0</Column>
  <Column id="41" name="LoginSid">01</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="2" name="BinaryData">C4BB62D6EC216D43991CDCB061A34782</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:32.82-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="50" name="XactSequence">0</Column>
</Event>
<Event id="33" name="Exception">
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">sa</Column>
  <Column id="31" name="Error">1222</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167066</Column>
  <Column id="12" name="SPID">19</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="60" name="IsSystem">1</Column>
  <Column id="1" name="TextData">Error: 1222, Severity: 16, State: 18</Column>
  <Column id="41" name="LoginSid">01</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:34.717-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">18</Column>
  <Column id="50" name="XactSequence">0</Column>
</Event>
<Event id="33" name="Exception">
  <Column id="31" name="Error">8525</Column>
  <Column id="8" name="HostName">MARTY755</Column>
  <Column id="12" name="SPID">55</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="64" name="SessionLoginName">fse</Column>
  <Column id="1" name="TextData">Error: 8525, Severity: 16, State: 1</Column>
  <Column id="9" name="ClientProcessID">2516</Column>
  <Column id="41" name="LoginSid">DB2744F54B5CDB4A8B9E5CA9C209A7AC</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="10" name="ApplicationName">.Net SqlClient Data Provider</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:37.54-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">1</Column>
  <Column id="50" name="XactSequence">236223201284</Column>
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">fse</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167067</Column>
</Event>
<Event id="162" name="User Error Message">
  <Column id="31" name="Error">8525</Column>
  <Column id="8" name="HostName">MARTY755</Column>
  <Column id="12" name="SPID">55</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="64" name="SessionLoginName">fse</Column>
  <Column id="1" name="TextData">Distributed transaction completed. Either enlist this session in a new transaction or the NULL transaction.</Column>
  <Column id="9" name="ClientProcessID">2516</Column>
  <Column id="41" name="LoginSid">DB2744F54B5CDB4A8B9E5CA9C209A7AC</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="10" name="ApplicationName">.Net SqlClient Data Provider</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:37.54-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">1</Column>
  <Column id="50" name="XactSequence">236223201284</Column>
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">fse</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167068</Column>
</Event>

EventSubClass 16 on the DTCTransaction event indicates "Transaction is aborting".

like image 662
BlueMonkMN Avatar asked Nov 10 '09 22:11

BlueMonkMN


3 Answers

To extend the timeout, which defaults to 10 minutes maximum when not specified, it's necessary to update C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\CONFIG\Machine.config on the target system (look under C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG if you're running 64-bit). Add this as the last item right under the root level:

<system.transactions>
    <machineSettings maxTimeout="23:00:00"/>
</system.transactions>

This will (as an example) set the timeout to 23 hours.

The effective value is visible in System.Transactions.TransactionManager.MaximumTimeout

like image 85
BlueMonkMN Avatar answered Oct 07 '22 11:10

BlueMonkMN


This is probably obvious to everyone reading except me, but I just got stuck on this issue and wanted to mention how I fixed it. Even though I modified the file at the location indicated by BlueMonkMN, I was still receiving the default transaction timeout of 10 minutes. Since I'm running Windows 7 64 Bit, the machine.config file location for .NET is at the following location:

C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG

Note the "Framework64" folder is different than above.

like image 24
yanigisawa Avatar answered Oct 07 '22 13:10

yanigisawa


Could it be the SqlConnection timing out and not the distributed transaction?

Update 1

You could use the SQL Server Profiler to try and monitor for an unexpected connection-drop. You'll just want to make sure to water your trace profile to only include the events you need to monitor as it's output can be quite verbose. I'd start with only monitoring the "Audit Login" and "Audit Logout" events that can be found under the "Security Audit" event category.

If you're profiling on anything other than a stand-alone/used-only-by-yourself SQL Server instance you'll probably want to apply a filter so only events originating from your host show up in the output.

You might want to explicitly specify a timeout value in your connection string - set it really low and see if you get the same behavior a lot quicker.

Update 2

From your trace log, I see two exceptions, whose details are:

  • Error: 1222, Severity: 16, State: 18
  • Error: 8525, Severity: 16, State: 1

The Googling for the 1222 exception turned up http://www.sqlservercentral.com/Forums/Topic579864-146-1.aspx#bm645422, which states:

That error means that a lock was requested in msdb and timed out. Usually, that's going to mean it's a big transaction on a big temp table or a big sort, or something of that type.

Do you have any particularly long-running queries that might be associated with it? Perhaps a heavy duty report, or something similar to that?

Hopefully this gets you a little farther.

like image 43
Yoopergeek Avatar answered Oct 07 '22 11:10

Yoopergeek