 

current logical threads increasing / thread stack is leaking

Monitoring my .NET app in Performance Monitor, I can see that the .NET CLR LocksAndThreads / # of current logical Threads counter is increasing steadily over time (currently 293), which indicates the thread stack is leaking.

I can find many articles which tell me this is the problem but nothing that tells me how to find the cause - so where do I start? Can Windbg tell me where the problem lies?

This is my Performance Monitor trace over 3 hours, showing the current logical thread count at 150:

[Image: thread leak]

And this is the output of the threads window, which doesn't tell me much because I can't access their call stacks - they are mostly marked as [unavailable] or [In a sleep, wait or join] | [External Code]:

Unflagged       141024  124 Worker Thread   <No Name>       Normal
Unflagged   >   0   0   Unknown Thread  [Thread Destroyed]      
Unflagged       136272  2   Worker Thread   <No Name>       Highest
Unflagged       133060  7   Worker Thread   vshost.RunParkingWindow [Managed to Native Transition]  Normal
Unflagged       136952  10  Main Thread Main Thread [edited].Program.Main   Normal
Unflagged       134544  9   Worker Thread   .NET SystemEvents   [Managed to Native Transition]  Normal
Unflagged       136556  11  Worker Thread   Worker Thread   [edited].MessageService.ProcessJobs.AnonymousMethod__0  Normal
Unflagged       141364  113 Worker Thread   <No Name>   [In a sleep, wait, or join] Normal
Unflagged       140896  0   Worker Thread   [Thread Destroyed]      Normal
Unflagged       136776  19  Worker Thread   <No Name>   [In a sleep, wait, or join] Normal
Unflagged       135704  20  Worker Thread   <No Name>   [In a sleep, wait, or join] Normal
Unflagged       136712  21  Worker Thread   <No Name>   [In a sleep, wait, or join] Normal
Unflagged       134984  22  Worker Thread   <No Name>   [In a sleep, wait, or join] Normal
Unflagged       134660  23  Worker Thread   Worker Thread   [edited].BroadcastService.ProcessJobs.AnonymousMethod__1d   Normal
Unflagged       140224  152 Worker Thread   <No Name>       Normal
Unflagged       140792  157 Worker Thread   <No Name>       Normal
Unflagged       137116  0   Worker Thread   <No Name>       Normal
Unflagged       140776  111 Worker Thread   <No Name>       Normal
Unflagged       140784  0   Worker Thread   [Thread Destroyed]      Normal
Unflagged       140068  145 Worker Thread   <No Name>       Normal
Unflagged       139000  150 Worker Thread   <No Name>       Normal
Unflagged       140828  52  Worker Thread   <No Name>       Normal
Unflagged       137752  146 Worker Thread   <No Name>       Normal
Unflagged       140868  151 Worker Thread   <No Name>       Normal
Unflagged       141324  139 Worker Thread   <No Name>       Normal
Unflagged       140168  154 Worker Thread   <No Name>       Normal
Unflagged       141848  0   Worker Thread   [Thread Destroyed]      Normal
Unflagged       135544  153 Worker Thread   <No Name>       Normal
Unflagged       142260  140 Worker Thread   <No Name>       Normal
Unflagged       141528  142 Worker Thread   <No Name>   [In a sleep, wait, or join] Normal
Unflagged       141344  0   Worker Thread   [Thread Destroyed]      Normal
Unflagged       140096  136 Worker Thread   <No Name>       Normal
Unflagged       141712  134 Worker Thread   <No Name>       Normal
Unflagged       141688  147 Worker Thread   <No Name>       Normal

Update: I've since tracked the culprit down to a System.Timers.Timer. Even when this timer called an empty method on each Elapsed event it still raised the logical thread count indefinitely. Just changing the timer to a DispatcherTimer has fixed the problem.

I started looking into all the timers in my application after seeing a large number when running !dumpheap -type TimerCallback in Windbg as mentioned in this question.

I'd still like to know how I could have detected this via Windbg debugging rather than the disable timers/check performance/repeat method that led me to the fix, i.e. anything that could have told me which timer was creating the problem.

DaveO asked Feb 21 '12 12:02



1 Answer

This is typically caused by thread-pool threads getting stuck and never completing. Every half second, the thread-pool manager allows another thread to start in an attempt to work down the backlog. This continues until the pool reaches the maximum number of threads set by ThreadPool.SetMaxThreads(), which by default is a huge number: 1000 on a 4-core machine.
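
To see this behaviour in isolation, here is a minimal sketch (not the asker's code) that queues work items which never finish and samples how many pool workers are in use. The class name ThreadPoolProbe, the item count of 50 and the 10-second sampling window are all illustrative assumptions; the exact growth rate depends on the runtime version and core count.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; names are illustrative, not from the question.
public static class ThreadPoolProbe
{
    // Worker threads currently in use = max workers - available workers.
    public static int BusyWorkers()
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        return maxWorkers - freeWorkers;
    }

    public static void Main()
    {
        var never = new ManualResetEvent(false);

        // Simulate the leak: each work item blocks and does not complete.
        for (int i = 0; i < 50; i++)
        {
            ThreadPool.QueueUserWorkItem(_ => never.WaitOne());
        }

        // Sample once per second; the count should creep upward as the
        // pool injects extra threads to work down the backlog.
        for (int s = 0; s < 10; s++)
        {
            Thread.Sleep(1000);
            Console.WriteLine("t={0}s busy workers={1}", s + 1, BusyWorkers());
        }

        never.Set(); // release the blocked work items before exiting
    }
}
```

Logging BusyWorkers() periodically in the real application is one way to confirm that the rising counter comes from stuck pool threads rather than from threads the code creates explicitly.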

Use Debug + Windows + Threads to look at the running threads. Their call stack should make it obvious why they are blocking.
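
When a debugger is not attached, a rough tally of thread states can be taken from inside the process. The sketch below is only a complement to the Threads window: it counts OS threads (not the CLR's logical threads), shows no call stacks, and assumes Windows, where ProcessThread.WaitReason is populated. The class name ThreadStateSummary is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper; counts this process's OS threads by state.
public static class ThreadStateSummary
{
    public static Dictionary<string, int> Summarize()
    {
        var counts = new Dictionary<string, int>();
        foreach (ProcessThread t in Process.GetCurrentProcess().Threads)
        {
            string key;
            try
            {
                // WaitReason is only valid while the thread is waiting.
                key = t.ThreadState == ThreadState.Wait
                    ? "Wait/" + t.WaitReason
                    : t.ThreadState.ToString();
            }
            catch (InvalidOperationException)
            {
                key = "Unknown"; // state changed or is not readable
            }

            int n;
            counts.TryGetValue(key, out n);
            counts[key] = n + 1;
        }
        return counts;
    }

    public static void Main()
    {
        foreach (var pair in Summarize())
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

A steadily growing count under a single wait reason suggests where to look first once the call stacks are available.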

Hans Passant answered Nov 17 '22 19:11
