
Debugging RtlUserThreadStart in Process Explorer

I have a multi-threaded WPF application built on .NET 3.5. When I look at its running threads in Process Explorer, I see 8 threads that all have the same start address, ntdll.dll!RtlUserThreadStart. All eight show a CPU value of 3-6+ and a high Cycles Delta. I can't figure out what these threads are doing. It is always the same threads; the set never varies within a single instance of the application. When I debug the application at the same time and pause the debugger, each of these threads shows a single-line stack: either System.Threading.ConcurrencyScheduler.Scheduler.WaitForWork() or System.Threading.Monitor.Wait().

I enabled the symbol files for Visual Studio and I see the following stack on those threads:

System.Threading.Monitor.Wait() Normal
mscorlib.dll!System.Threading.Monitor.Wait(object obj, int millisecondsTimeout) + 0x19 bytes
System.Threading.dll!System.Threading.ConcurrencyScheduler.Scheduler.WaitForWork() + 0xd0 bytes
System.Threading.dll!System.Threading.ConcurrencyScheduler.InternalContext.Dispatch() + 0x74a bytes
System.Threading.dll!System.Threading.ConcurrencyScheduler.ThreadInternalContext.ThreadStartBridge(System.IntPtr dummy) + 0x9f bytes

When I look at the stack shown for these threads in Process Explorer, I see stacks like the following:

0  ntoskrnl.exe!KeWaitForMultipleObjects+0xc0a
1  ntoskrnl.exe!KeAcquireSpinLockAtDpcLevel+0x732
2  ntoskrnl.exe!KeWaitForSingleObject+0x19f
3  ntoskrnl.exe!_misaligned_access+0xba4
4  ntoskrnl.exe!_misaligned_access+0x1821
5  ntoskrnl.exe!_misaligned_access+0x1a97
6  mscorwks.dll!InitializeFusion+0x990b
7  mscorwks.dll!DeleteShadowCache+0x31ef

or:

0  ntoskrnl.exe!KeWaitForMultipleObjects+0xc0a
1  ntoskrnl.exe!KeAcquireSpinLockAtDpcLevel+0x732
2  ntoskrnl.exe!KeWaitForSingleObject+0x19f
3  ntoskrnl.exe!_misaligned_access+0xba4
4  ntoskrnl.exe!_misaligned_access+0x1821
5  ntoskrnl.exe!KeAcquireSpinLockAtDpcLevel+0x93d
6  ntoskrnl.exe!KeWaitForMultipleObjects+0x26a
7  ntoskrnl.exe!NtWaitForSingleObject+0x41f
8  ntoskrnl.exe!NtWaitForSingleObject+0x78e
9  ntoskrnl.exe!KeSynchronizeExecution+0x3a23
10 ntdll.dll!ZwWaitForMultipleObjects+0xa
11 KERNELBASE.dll!GetCurrentProcess+0x40
12 KERNEL32.dll!WaitForMultipleObjectsEx+0xb3
13 mscorwks.dll!CreateApplicationContext+0x10499
14 mscorwks.dll!CreateApplicationContext+0xbc41
15 mscorwks.dll!StrongNameFreeBuffer+0xc54d
16 mscorwks.dll!StrongNameFreeBuffer+0x2ac48
17 mscorwks.dll!StrongNameTokenFromPublicKey+0x1a5ea
18 mscorwks.dll!CopyPDBs+0x17362
19 mscorwks.dll!CorExitProcess+0x3dc9
20 mscorwks.dll!TranslateSecurityAttributes+0x547f
21 mscorlib.ni.dll+0x8e6bc9

As an additional note: my computer has a single CPU with 4 cores. When we run the same app on a machine with two CPUs of 4 cores each, the number of these threads goes from 8 to 16.

Ben, asked Jan 16 '23 06:01



1 Answer

Your question is woefully under-documented, but a reasonable guess is that you are using the PPL (Parallel Patterns Library), which keeps a pool of threads around to get parallel jobs done. You are no doubt seeing high CPU cycle counts because these threads are indeed doing the job you asked them to do.

As is typical with thread pools, the PPL keeps these threads around for the next job, which is why you see them waiting in WaitForWork(). The native stack traces are junk because debugging symbols are missing, so the function names and offsets shown there cannot be trusted. RtlUserThreadStart is simply the Windows function you'll always see at the bottom of an unmanaged stack trace; it is how a thread gets started.
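To make the "waiting for the next job" behavior concrete, here is a minimal sketch of a thread pool in standard C++. It is not how the PPL or the concurrency runtime is implemented; `WorkerPool`, `dispatch`, `submit` and `idle_count` are hypothetical names chosen for illustration. It only shows the general pattern: each worker is created once, then parks on a condition variable whenever the queue is empty, which is the state a debugger would show as a wait call at the top of the stack.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the concurrency runtime's actual design.
class WorkerPool {
public:
    explicit WorkerPool(std::size_t thread_count) {
        assert(thread_count > 0);
        for (std::size_t i = 0; i < thread_count; ++i) {
            // Threads are created once and reused for every later job.
            workers_.emplace_back([this] { dispatch(); });
        }
    }

    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        work_available_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        work_available_.notify_one();
    }

    std::size_t thread_count() const { return workers_.size(); }

    // Number of workers currently parked, waiting for a job.
    std::size_t idle_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_;
    }

private:
    // Worker loop: run jobs while there are any, otherwise block.
    void dispatch() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ++idle_;
            // An idle worker sits here; this is the "wait for work" state.
            work_available_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
            --idle_;
            if (jobs_.empty()) return;  // stopping and the queue is drained
            std::function<void()> job = std::move(jobs_.front());
            jobs_.pop();
            lock.unlock();
            job();  // run the job without holding the lock
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable work_available_;
    std::queue<std::function<void()>> jobs_;
    std::size_t idle_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

Constructing `WorkerPool pool(8);` and calling `pool.submit(...)` a few times runs the jobs on the eight workers; once the queue empties, all eight stay alive and blocked in `dispatch()` until the pool is destroyed. The thread count stays fixed no matter how much work is queued, which matches the observation that it is always the same threads.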

This is all entirely normal. The only other info worthy of note is this answer posted by a Microsoft employee:

The concurrency runtime caches threads for later re-use. They are released only when all the concurrency runtime schedulers have been shut down. (Typically, there is just a single default scheduler in the process.) A scheduler is shut down when all the external threads that queued work to it have exited. So if the main thread scheduled work (by calling parallel_for from main(), say) then the default scheduler would be deleted only on process shutdown.

There is an upper limit on the number of cached threads. It is roughly 4 times the number of cores on the machine (though there are some other factors affecting the threshold, like the stack size option in scheduler policies).
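As a quick sanity check of the numbers in the question against that rule of thumb, here is a small sketch in standard C++. The factor of 4 is only the approximation quoted above, and the function names are hypothetical; the real threshold depends on other factors such as scheduler policies.

```cpp
#include <cassert>
#include <thread>

// Assumption: "roughly 4 times the number of cores", taken from the quoted
// answer. This is an estimate, not a value read from the runtime.
constexpr unsigned kApproxCachedThreadsPerCore = 4;

// Estimated upper bound on cached threads for a given core count.
unsigned approx_cached_thread_limit(unsigned cores) {
    assert(cores > 0);
    return kApproxCachedThreadsPerCore * cores;
}

// Core count reported by the standard library; it may report 0 when the
// value is unknown, in which case this falls back to 1.
unsigned detected_cores() {
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}

// True if an observed thread count fits under the estimated limit.
bool within_approx_limit(unsigned observed_threads, unsigned cores) {
    return observed_threads <= approx_cached_thread_limit(cores);
}
```

With the figures from the question, 8 threads on 4 cores sits under an estimated limit of 16, and 16 threads on 8 cores sits under an estimated limit of 32. In both cases the observed count works out to 2 threads per core, so it scaling with the core count is what you would expect.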

Hans Passant, answered Jan 29 '23 14:01
