
How can I find out why my thread is being stopped in ASP.NET?

Our logs are reporting ThreadAbortExceptions that are stopping our Quartz.NET jobs at seemingly random intervals. From what I understand, this wouldn't normally be caused by something the thread itself is doing (e.g. reading a file from an FTP server, or executing a LINQ to Entities query), but rather because some outside process is telling the thread to stop. Furthermore, the way the logs are being created leads me to believe that the entire web application is being restarted when we get these errors, so I'm guessing that the restart process is what's causing the thread to be aborted in the first place.

So my question is: how can I figure out why the server/application is being restarted? Are there logs somewhere that would give me details on each restart? Are there common causes for something like this that I should investigate?

Thanks in advance for your help.

Edit

I just had a discussion with some co-workers, and it sounds like IIS automatically shuts the application down after a certain period of inactivity, which might be part of the problem. With some research, I've found an "Idle Timeout" setting for the application pool's worker processes in IIS. I think that when the application hasn't processed any requests for a certain amount of time, IIS issues a shutdown command. For some reason Quartz doesn't shut down immediately. Instead it waits for the next job to fire, and then the system detects that job's thread and kills it while it's trying to run.

So I guess we need to come up with some way to gracefully finish any running jobs when the system wants to shut down, and to make Quartz actually shut down when it's told to if it's not running any jobs. Does anybody have any experience with this sort of issue?

StriplingWarrior asked Dec 03 '10 16:12


3 Answers

As liho1eye pointed out, the problem arose from the application pool shutting down our application. For some reason, Quartz apparently wasn't shutting down immediately. Instead, it waited until the next job ran and shut down then, which meant that the running job had to be stopped via a ThreadAbortException.

Our solution to this was two-fold. First, we updated Quartz to a more recent version, which seemed to make it behave a little better. Second, in our Application_End method in Global.asax.cs, we added a call to Scheduler.Shutdown(true). This tells the scheduler to stop firing additional triggers, and then it waits until all the currently running jobs complete before allowing the application to end.
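For anyone who wants to see the shape of that second step, here is a minimal sketch rather than our exact code. It assumes the synchronous Quartz.NET 1.x/2.x API, where GetScheduler() returns an IScheduler directly and Shutdown(bool) blocks; in Quartz.NET 3.x these calls return Tasks and would need to be awaited. The static field name and the way the scheduler is created in Application_Start are illustrative assumptions.

```csharp
using System;
using System.Web;
using Quartz;
using Quartz.Impl;

public class Global : HttpApplication
{
    // Hypothetical holder for the scheduler; any shared reference would do.
    private static IScheduler scheduler;

    protected void Application_Start(object sender, EventArgs e)
    {
        // Assumes Quartz.NET 1.x/2.x, where GetScheduler() is synchronous.
        ISchedulerFactory factory = new StdSchedulerFactory();
        scheduler = factory.GetScheduler();
        scheduler.Start();
    }

    protected void Application_End(object sender, EventArgs e)
    {
        if (scheduler != null && !scheduler.IsShutdown)
        {
            // true = stop firing new triggers and block until
            // the jobs that are already running have finished.
            scheduler.Shutdown(true);
        }
    }
}
```

Keep in mind that IIS only waits a limited time for the application to end, so a very long-running job may still be cut off even with this in place.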

StriplingWarrior answered Nov 01 '22 01:11


Naturally this means that something somewhere called Thread.Abort() on the instance of your job thread. I would look towards this Quartz thing for explanation.

Another possibility is that your job thread is a background thread and your app pool is being recycled, but I don't know enough about this Quartz thing to tell for sure.
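One way to check the recycling theory is to record why ASP.NET is taking the application down. The sketch below reads HostingEnvironment.ShutdownReason in Application_End and writes it out; the use of Trace and the message format are assumptions for illustration, so substitute whatever logger the application already uses.

```csharp
using System;
using System.Diagnostics;
using System.Web;
using System.Web.Hosting;

public class Global : HttpApplication
{
    protected void Application_End(object sender, EventArgs e)
    {
        // ShutdownReason is an ApplicationShutdownReason enum value,
        // e.g. IdleTimeout, ConfigurationChange or HostingEnvironment.
        ApplicationShutdownReason reason = HostingEnvironment.ShutdownReason;

        // Trace is a stand-in here; use the application's own logger.
        Trace.WriteLine("Application ending at "
            + DateTime.UtcNow.ToString("o")
            + ", reason: " + reason);
    }
}
```

If the logged reason lines up with the timestamps of the ThreadAbortExceptions, that points at the shutdown rather than at the job code itself.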

Ilia G answered Nov 01 '22 02:11


If you perform any redirects in your code without specifying the endResponse parameter of Response.Redirect (or with it set to true), the redirect ends the response by calling Thread.Abort() on the current thread, even though there is still code left to execute. That code is abandoned when the thread is aborted, and you get the exception. Further reading:

http://www.c6software.com/articles/ThreadAbortException.aspx
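As a rough illustration of the redirect case, the sketch below passes false for endResponse and then calls CompleteRequest() so the request finishes without the thread being aborted. The page class name and target URL are made up for the example.

```csharp
using System;
using System.Web.UI;

// Hypothetical code-behind class; the name and URL are placeholders.
public partial class LoginPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // endResponse = false: do not call Response.End(), so no
        // ThreadAbortException is raised on this thread.
        Response.Redirect("~/Default.aspx", false);

        // Skip the remaining pipeline events and go straight to EndRequest.
        Context.ApplicationInstance.CompleteRequest();
    }
}
```

Note that with endResponse set to false, any code after the redirect in the same method still runs, so return early if it should not.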

Edit:
Another possibility would be an unhandled server level exception that causes the w3wp.exe process to crash or recycle itself. This would be the external cause you alluded to that would cause the thread to abort but attempt to continue running code. To determine if this might be the case, you would have exceptions in your System Event Log. They're very generic, but they'll clearly list w3wp.exe (so you can use that as a filter). If this turns out to be the case, you'll need to install IIS Debug Diagnostics and set up some crash monitors to catch what is going on at the moment of the crash. Since it happens outside of the actual page lifecycle, normal exception handling gets bypassed.

Joel Etherton answered Nov 01 '22 03:11