ASP.NET application slow but CPU is at 40% max

I have a strange situation on a production server. Connections for ASP.NET get queued, but the CPU is only at 40%. The database also runs fine at 30% CPU.

Some more history as requested in the comments:

  • In peak hours the site gets around 20,000 visitors an hour.
  • The site is an ASP.NET Web Forms application with a lot of AJAX/POSTs.
  • The site uses a lot of user-generated content.
  • We measure the performance of the site with a test page that hits the database and the web services used by the site. This page is served within a second under normal load. We define the application as slow when the request takes more than 4 seconds.
  • From the measurements we can see that the connection time is fast, but the processing time is long.
  • We can't pin the slow response down to a single request; the site runs fine during normal hours but gets slow during peak hours.
  • We had a problem where the site was CPU bound (running at 100%); we fixed that.
  • We also had problems with exceptions making the appdomain restart; we fixed that too.
  • During peak hours I look at the ASP.NET performance counters. We see 600 current connections with 500 queued connections.
  • At peak times the CPU is around 40% (which makes me think that it is not CPU bound).
  • Physical memory is around 60% used.
  • At peak times the database server CPU is around 30% (which makes me think it is not database bound).

My conclusion is that something else is stopping the server from handling the requests faster. Possible suspects:

  • Deadlocks (!syncblk only gives one lock)
  • Disk I/O (checked via Sysinternals Process Explorer: 3.5 MB/s)
  • Garbage collection (10~15% during peaks)
  • Network I/O (connect time still low)

To find out what the process is doing, I created two minidumps.

I managed to create two memory dumps 20 seconds apart. This is the output of the first:

!threadpool
CPU utilization 6%
Worker Thread: Total: 95 Running: 72 Idle: 23 MaxLimit: 200 MinLimit: 100
Work Request in Queue: 1
--------------------------------------
Number of Timers: 64

and the output of the second:

!threadpool
CPU utilization 9%
Worker Thread: Total: 111 Running: 111 Idle: 0 MaxLimit: 200 MinLimit: 100
Work Request in Queue: 1589

As you can see, there are a lot of requests in the queue.
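
To compare the two dumps side by side, here is a minimal Python sketch. It was not part of the original debugging session, and it assumes the !threadpool summary has exactly the layout pasted above. The names ThreadPoolSnapshot, parse_threadpool and starved are made up for illustration.

```python
import re
from dataclasses import dataclass

# Matches the summary lines of the !threadpool output as pasted above.
SUMMARY = re.compile(
    r"CPU utilization (?P<cpu>\d+)%\s+"
    r"Worker Thread: Total: (?P<total>\d+) Running: (?P<running>\d+) "
    r"Idle: (?P<idle>\d+) MaxLimit: (?P<max_limit>\d+) MinLimit: (?P<min_limit>\d+)\s+"
    r"Work Request in Queue: (?P<queued>\d+)"
)


@dataclass
class ThreadPoolSnapshot:  # hypothetical helper type, not an SOS concept
    cpu: int
    total: int
    running: int
    idle: int
    max_limit: int
    min_limit: int
    queued: int

    @property
    def starved(self) -> bool:
        # No idle worker is left while work is still waiting in the queue.
        return self.idle == 0 and self.queued > 0


def parse_threadpool(text: str) -> ThreadPoolSnapshot:
    match = SUMMARY.search(text)
    if match is None:
        raise ValueError("unrecognised !threadpool output")
    return ThreadPoolSnapshot(**{k: int(v) for k, v in match.groupdict().items()})


FIRST = """CPU utilization 6%
Worker Thread: Total: 95 Running: 72 Idle: 23 MaxLimit: 200 MinLimit: 100
Work Request in Queue: 1"""

SECOND = """CPU utilization 9%
Worker Thread: Total: 111 Running: 111 Idle: 0 MaxLimit: 200 MinLimit: 100
Work Request in Queue: 1589"""

first, second = parse_threadpool(FIRST), parse_threadpool(SECOND)
print("threads added:", second.total - first.total)
print("queue growth:", second.queued - first.queued)
print("starved:", first.starved, second.starved)
```

In the second dump every worker thread is running and the queue has grown, yet the CPU is at 9%, which fits threads that are waiting on something rather than computing.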

Question 1: What does it mean that there are 1589 requests in the queue? Does it mean something is blocking?

The !threadpool list contains mostly these entries:

Unknown Function: 6a2aa293 Context: 01cd1558 AsyncTimerCallbackCompletion TimerInfo@023a2cb0

If I go into more depth on AsyncTimerCallbackCompletion:

!dumpheap -type TimerCallback

When I then look at the objects behind the TimerCallback instances, most of them are of these types:

System.Web.SessionState.SessionStateModule
System.Web.Caching.CacheCommon

Question 2: Does it make sense that those objects have a timer, and that there are so many of them? Should I prevent this, and if so, how?

Main question: Am I missing any obvious reason why I'm queueing connections without maxing out the CPU?


I succeeded in making a crash dump during a peak. Analyzing it with DebugDiag gave me this warning:

Detected possible blocking or leaked critical section at webengine!g_AppDomainLock owned by thread 65 in Hang Dump.dmp
Impact of this lock
25.00% of threads blocked
(Threads 11 20 29 30 31 32 33 39 40 41 42 74 75 76 77 78 79 80 81 82 83)

The following functions are trying to enter this critical section
webengine!GetAppDomain+c9

The following module(s) are involved with this critical section
\\?\C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\webengine.dll from Microsoft Corporation

A quick Google search doesn't give me any results. Does anybody have a clue?

wasigh asked Nov 19 '10 16:11



2 Answers

I know this is an old thread but it's one of the first Google hits for people with poor ASP.NET site performance. So I will throw out a few recommendations:

1) Asynchronous programming will solve the root cause. While you're calling out to a web service to do your actual business logic, those request threads are just sitting there waiting on the response. They could instead be used to service another incoming request. This will reduce your queue length dramatically, if not eliminate it entirely. Asynchronous programming is about scalability, not individual request performance. It is quite easy to achieve in .NET 4.5 with the async/await pattern. The thread pool injects new threads slowly (about 2 per second once it is past its minimum, depending on the CLR version), so unless you are re-using the existing threads, you're going to run out quickly with the site load you are receiving. In addition, spinning up more threads carries a small performance hit: each thread takes up more RAM, and it takes time to allocate that RAM. Just increasing the thread pool size in machine.config won't fix the underlying problem. Unless you add more CPUs, adding more threads won't really help, since it's still a misallocation of resources, and you can also context-switch yourself to death by having too many threads and too little CPU.
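
The effect can be sketched outside ASP.NET. The following Python model is an illustration only, not ASP.NET code: time.sleep stands in for a blocking web service call and asyncio.sleep for an awaited one. The latency, request count, and pool size are made-up numbers, and the function names are hypothetical.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

CALL_SECONDS = 0.05  # assumed latency of one backend call
REQUESTS = 20        # assumed number of simultaneous requests
WORKERS = 4          # assumed size of the blocking worker pool


def blocking_request(_: int) -> None:
    # The worker thread is held for the whole wait.
    time.sleep(CALL_SECONDS)


async def async_request() -> None:
    # The wait is handed to the event loop; no thread is held.
    await asyncio.sleep(CALL_SECONDS)


def run_blocking() -> float:
    """Serve all requests with a fixed pool of blocking workers."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        list(pool.map(blocking_request, range(REQUESTS)))
    return time.perf_counter() - start


async def _serve_all() -> None:
    await asyncio.gather(*(async_request() for _ in range(REQUESTS)))


def run_async() -> float:
    """Serve all requests on one thread using awaited waits."""
    start = time.perf_counter()
    asyncio.run(_serve_all())
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"blocking: {run_blocking():.2f}s, async: {run_async():.2f}s")
```

With 4 blocking workers the 20 requests have to queue in batches, while the awaited version overlaps all the waits on a single thread. Neither version uses noticeable CPU, which mirrors the low-CPU, long-queue symptom in the question.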

2) From a popular article on threading in IIS 7.5: If your ASP.NET application is using web services (WCF or ASMX) or System.Net to communicate with a backend over HTTP, you may need to increase connectionManagement/maxconnection. For ASP.NET applications, this is limited to 12 * #CPUs by the autoConfig feature. This means that on a quad-proc, you can have at most 12 * 4 = 48 concurrent connections to an IP end point. Because this is tied to autoConfig, the easiest way to increase maxconnection in an ASP.NET application is to set System.Net.ServicePointManager.DefaultConnectionLimit programmatically, from Application_Start, for example. Set the value to the number of concurrent System.Net connections you expect your application to use. I've set this to Int32.MaxValue and not had any side effects, so you might try that; this is actually the default used in the native HTTP stack, WinHTTP. If you're not able to set System.Net.ServicePointManager.DefaultConnectionLimit programmatically, you'll need to disable autoConfig, but that means you also need to set maxWorkerThreads and maxIoThreads. You won't need to set minFreeThreads or minLocalRequestFreeThreads if you're not using classic/ISAPI mode.
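
To see why that limit matters, here is a small worked calculation in Python. The 12 * #CPUs rule and the quad-proc example come from the article quoted above; the 0.5 second call latency and the function names are assumptions for illustration.

```python
def autoconfig_connection_limit(cpu_count: int) -> int:
    # Rule quoted above: autoConfig caps connections per endpoint at 12 * #CPUs.
    return 12 * cpu_count


def max_calls_per_second(connection_limit: int, seconds_per_call: float) -> float:
    # Each call holds one connection for its whole duration, so at most
    # connection_limit calls can complete every seconds_per_call seconds.
    return connection_limit / seconds_per_call


limit = autoconfig_connection_limit(4)      # quad-proc example from the text
ceiling = max_calls_per_second(limit, 0.5)  # 0.5 s per call is an assumed latency
print(limit, ceiling)
```

Any calls beyond that ceiling have to wait for a free connection, however idle the CPU is.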

3) You should really look at load-balancing if you're getting 20k unique visitors per hour. If every user did 10-20 AJAX requests per hour, that is 200,000 to 400,000 requests per hour, and if each request makes several web service calls, you're easily talking about 1 million or more calls to your backend. Throwing up another server would reduce the load on the primary server. Combine this with async/await and you've put yourself in a good situation where you can easily throw hardware at the problem (scaling out). There are multiple benefits here, such as hardware redundancy, geolocation, and performance. If you're using a cloud provider such as AWS or RackSpace, spinning up another VM with your app on it is easy enough that it can be done from your mobile phone. Cloud computing is too cheap nowadays to have a queue length at all. You could do this to get the performance benefits even before you make the switch to an asynchronous programming model.
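
A rough back-of-the-envelope version of that load, as a Python sketch. The visitor and request counts are the ones above, and the 1 second and 4 second response times are the "normal" and "slow" thresholds from the question. The sketch assumes requests are spread evenly over the hour, which understates real bursts, and the function names are made up.

```python
SECONDS_PER_HOUR = 3600


def requests_per_second(visitors_per_hour: int, requests_per_visitor: int) -> float:
    # Average arrival rate, assuming an even spread over the hour.
    return visitors_per_hour * requests_per_visitor / SECONDS_PER_HOUR


def requests_in_flight(rate_per_second: float, seconds_per_request: float) -> float:
    # Little's law: average number in the system = arrival rate * time in system.
    return rate_per_second * seconds_per_request


for per_visitor in (10, 20):
    rate = requests_per_second(20_000, per_visitor)
    print(
        f"{per_visitor} requests/visitor: {rate:.0f} req/s, "
        f"{requests_in_flight(rate, 1.0):.0f} in flight at 1 s, "
        f"{requests_in_flight(rate, 4.0):.0f} in flight at 4 s"
    )
```

The point of the arithmetic is that the number of requests in flight grows in proportion to response time, so a slowdown in the backend calls directly multiplies the number of threads that are tied up.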

4) Scaling up: adding more hardware to your server(s) helps because it provides better stability when you have additional threads. More threads means you need more CPUs and RAM. And even after you've gotten async/await under your belt, you'll still want to fine-tune those web service requests if you can. This could mean adding a caching layer or beefing up your database system. You do NOT want to max out the CPU on that single server. Once the CPU reaches 80%, ASP.NET will stop injecting more threads into the system. It doesn't matter if the worker process is sitting at 0%; if the overall system CPU utilization as reported by Task Manager reaches 80%, thread injection stops and requests begin to queue. Weird things also happen with garbage collection when it detects a high CPU load on the server.

Tim P. answered Nov 15 '22 21:11



The way the worker process handled the queue was the real dealbreaker. This is probably connected with the website calling web services on the same host, which creates a kind of deadlock.

I changed the machine.config to the following:

<processModel
        autoConfig="false"
        maxWorkerThreads="100"
        maxIoThreads="100"
        minWorkerThreads="50"
        minIoThreads="50" />

By default, processModel is set to autoConfig="true".

With the new config the web server handles the requests fast enough that they no longer get queued.

wasigh answered Nov 15 '22 23:11
