How to Troubleshoot Intermittent SQL Timeout Errors

We've been having a few instances per day where we get a slew of SQL Timeout errors from multiple applications (System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.) We have over 100 different applications on our network, both web and desktop apps. Everything from VB6 and Classic ASP to .NET 4. I can find all kinds of data that show the side effects but can't pinpoint what is causing this. Our DBA says nothing is wrong with the SQL server, and IT says there's nothing wrong with the web servers or network, so of course I'm left in the middle trying to troubleshoot this.

I'm really just looking for suggestions on what other troubleshooting I can do to try and track this down.

We're running SQL Server 2008 R2 in a cluster. A handful of different servers connect to it, running various editions of Windows Server 2003 through 2008.

Here's what I've done so far:

  • Run SQL trace of long running queries and deadlocks. This shows no deadlocks at the times of the problems, and long running queries all coincide with our timeout errors, but look to be a side effect, and not the cause. Queries that are very basic that typically return instantly end up taking 30, 60 or 120 seconds to run at times. This happens for a few minutes then everything picks up and works fine after that.
  • Use performance monitor to track connection pool connections. This sometimes shows some spikes in the number of connections near the times of the timeouts, but still not even halfway to the default 100 connection limit. Again, nothing here that seems to point to a cause.
  • Separate web applications into different App Pools. We tried to narrow down the apps we thought may be the main problem (most chatty, etc) and put them in separate Application Pools but that doesn't seem to affect anything or help us narrow down anything.
  • Monitor disk usage on SQL Server. We've done some monitoring on the SQL server and see no spikes or any signs of problems when these timeouts are occurring.
  • Verified TempDB was not the cause of the problem.

I'll come back and add more if I think of what else we've tried. Please let me know some ideas on what to troubleshoot next.

Shawn Steward asked Oct 12 '11 16:10


1 Answer

Quoting the question: "Run SQL trace of long running queries and deadlocks. This shows no deadlocks at the times of the problems, and long running queries all coincide with our timeout errors, but look to be a side effect, and not the cause. Queries that are very basic that typically return instantly end up taking 30, 60 or 120 seconds to run at times. This happens for a few minutes then everything picks up and works fine after that."

It looks like some queries or transactions hold locks in your database until they finish. You have to find out which queries are doing the blocking and either rewrite them or run them at another time so they don't block other processes. Right now the waiting queries simply time out.
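One way to find the blocking queries is to take a snapshot of the running requests while the timeouts are happening and trace each blocked session back to the session at the head of its chain. Below is a rough Python sketch of that tracing step. The function name find_head_blockers and the sample session IDs are made up for illustration. The query string is an assumption about how you might capture the snapshot from sys.dm_exec_requests, where blocking_session_id is 0 or NULL for requests that are not blocked.

```python
from collections import defaultdict

# Assumed T-SQL for capturing the snapshot; run it while timeouts are occurring.
SNAPSHOT_QUERY = """
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE session_id <> @@SPID;
"""


def find_head_blockers(rows):
    """Map each head blocker to the sorted list of sessions waiting behind it.

    rows: iterable of (session_id, blocking_session_id) pairs taken from
    one snapshot. A blocking_session_id of 0 or None means "not blocked".
    """
    blocked_by = {sid: blocker for sid, blocker in rows if blocker}
    victims = defaultdict(list)
    for sid in blocked_by:
        # Walk up the chain until we reach a session that is not itself blocked.
        # The seen set guards against looping forever on a circular wait.
        head, seen = sid, set()
        while head in blocked_by and head not in seen:
            seen.add(head)
            head = blocked_by[head]
        victims[head].append(sid)
    return {head: sorted(sids) for head, sids in victims.items()}


if __name__ == "__main__":
    # Hypothetical snapshot: 53 waits on 52, which waits on 51; 55 waits on 60.
    sample = [(51, 0), (52, 51), (53, 52), (54, 0), (55, 60)]
    print(find_head_blockers(sample))
```

Note that a head blocker does not have to appear in the snapshot itself (session 60 in the sample). An idle connection that left a transaction open has no running request but can still hold locks, so it is worth looking up what those sessions ran last.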

Another thing to dig into is the autogrowth setting of your transaction log and data files. Set it to a fixed size instead of a percentage of the current file size. With percentage growth, each growth step gets bigger as the files get bigger, so the time it takes to allocate the space will eventually exceed your transaction timeout, and your database comes to a halt while it grows.
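To make the arithmetic concrete, here is a small Python sketch comparing the size of successive growth steps under percentage growth and fixed growth. The starting size of 100,000 MB, the 10% rate and the 512 MB increment are made-up numbers for illustration, not values from the question, and growth_steps_mb is a hypothetical helper.

```python
def growth_steps_mb(initial_mb, steps, percent=None, fixed_mb=None):
    """Return the size in MB of each successive autogrowth step.

    Pass exactly one of percent (growth as a percentage of the current
    file size) or fixed_mb (a constant growth increment).
    """
    if (percent is None) == (fixed_mb is None):
        raise ValueError("pass exactly one of percent or fixed_mb")
    size = float(initial_mb)
    increments = []
    for _ in range(steps):
        if percent is not None:
            step = size * percent / 100  # step scales with the current size
        else:
            step = float(fixed_mb)       # step stays constant
        increments.append(round(step, 1))
        size += step
    return increments


if __name__ == "__main__":
    # Hypothetical 100,000 MB file growing three times.
    print(growth_steps_mb(100_000, 3, percent=10))
    print(growth_steps_mb(100_000, 3, fixed_mb=512))
    # In SQL Server the setting itself is the FILEGROWTH option of
    # ALTER DATABASE ... MODIFY FILE; the right increment depends on your workload.
```

With percentage growth every step is larger than the one before, so the pause for allocating space keeps getting longer. With a fixed increment each growth takes about the same time no matter how big the file has become.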

Peter answered Sep 17 '22 00:09