Threads in .NET are paused during GC. How can the CLR pause a thread safely? What is the risk of stopping a thread in a brutal way, for example with the Win32 SuspendThread API?
This is about the .NET Compact Framework, but I think it is also correct for normal .NET: http://blogs.msdn.com/b/abhinaba/archive/2009/09/02/netcf-gc-and-thread-blocking.aspx
Before actually running GC the CLR tries to go into a “safe point”. Each thread has a suspend event associated with it and this event is checked by each thread regularly. Before starting GC the CLR enumerates all managed threads and in each of them sets this event. In the next point when the thread checks and finds this event set, it blocks waiting for the event to get reset (which happens when GC is complete).
In general, thread 1 can safely pause thread 2 by setting a "please suspend now" flag that the code executing in thread 2 is guaranteed to test within a bounded period of time. Thread 2 then suspends itself at a safe point in its execution.
In the .NET CLR (it used to be the case, anyway), a thread could be in cooperative GC mode or preemptive mode. In cooperative mode, e.g. when running managed code methods, the thread occasionally explicitly checks whether it should yield.
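The flag-and-poll idea can be sketched in a few lines. This is only a toy model in Python, not how the CLR is implemented: `SafepointCoordinator`, `poll`, `suspend_all`, `resume_all` and `run_demo` are all made-up names, and the "safe point" is simply wherever a worker chooses to call `poll()`.

```python
import threading
import time


class SafepointCoordinator:
    """Toy model of cooperative suspension (hypothetical, not CLR code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._suspend_requested = False  # the "please suspend now" flag
        self._parked = 0                 # threads currently blocked in poll()

    def poll(self):
        """Called by a worker at a point where it is safe to stop."""
        with self._cond:
            if not self._suspend_requested:
                return
            self._parked += 1
            self._cond.notify_all()       # tell the requester we have parked
            while self._suspend_requested:
                self._cond.wait()         # block until resume_all() clears the flag
            self._parked -= 1

    def suspend_all(self, worker_count):
        """Set the flag and wait until every worker has parked itself."""
        with self._cond:
            self._suspend_requested = True
            while self._parked < worker_count:
                self._cond.wait()

    def resume_all(self):
        """Clear the flag and wake every parked worker."""
        with self._cond:
            self._suspend_requested = False
            self._cond.notify_all()


def run_demo(worker_count=3):
    coord = SafepointCoordinator()
    stop = threading.Event()
    counters = [0] * worker_count

    def worker(i):
        while not stop.is_set():
            counters[i] += 1  # stand-in for work that mutates shared state
            coord.poll()      # safe point: nothing is half-done here

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(worker_count)]
    for t in threads:
        t.start()

    coord.suspend_all(worker_count)
    snapshot = list(counters)
    time.sleep(0.05)                 # the "collector" can inspect state here
    unchanged = snapshot == counters
    coord.resume_all()

    stop.set()
    for t in threads:
        t.join()
    return unchanged
```

Calling `run_demo()` should return `True`: while every worker is parked inside `poll()`, none of the counters move, which is the property a collector needs before it walks the heap. The cost is that a worker which never reaches `poll()` would stall `suspend_all` forever, which is why the "bounded period of time" guarantee matters.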
In contrast, in preemptive mode, e.g. when running p-invoked native code, and sometimes when in the CLR itself (MSCORWKS.DLL etc.), another thread (including another CLR thread needing to GC) can unilaterally suspend the first thread at any instruction boundary.
Indeed, these can be combined. The CLR GC may set a "please suspend now" flag and then wait a while, in case a preemptive mode thread returns to cooperative mode (at which point it tests the flag and suspends itself). If, after waiting, the preemptive mode thread still hasn't suspended, the CLR (on another thread) might unilaterally suspend it.
When a .NET method runs in cooperative mode, the generated native code depends on the cooperative mode non-preemption guarantees to operate on the object memory (GC heap etc.) as efficiently as possible. If you were to externally and "brutally" suspend a cooperative GC thread at an arbitrary point in its execution, you could leave the object memory or other CLR internals in a state that is not suitable for a correct garbage collection cycle or other CLR operations. But this never happens in practice because that is not how cooperative mode threads are paused.
(There are obviously corner cases, such as when the cooperative mode thread completes its OS thread scheduler quantum and the OS context switches that core to some other thread. Since the thread is no longer running, it is not polling the yield flag and won't yield immediately. I don't recall what happens in such cases: whether the CLR has to wait until the thread is rescheduled, tests the flag, and yields, or whether it suspends the thread anyway and then inspects its IP to see if it was in a safe place. Anyone?)
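A rough sketch of that "ask first, wait a while, then deal with the stragglers" sequence is below. It is Python and purely illustrative: `HybridSuspender`, `request_suspend`, `grace_seconds` and `run_hybrid_demo` are invented names, and because Python has no way to forcibly suspend a thread, the sketch only reports which threads did not park in time. What a real runtime does with those threads (for example, suspending them unilaterally) is outside what this code shows.

```python
import threading
import time


class HybridSuspender:
    """Toy model: request cooperative suspension, then report stragglers."""

    def __init__(self, names):
        self._cond = threading.Condition()
        self._requested = False
        self._names = set(names)   # every thread we expect to park
        self._parked = set()       # threads currently blocked in poll()

    def poll(self, name):
        """Called by a thread whenever it is back in 'cooperative' code."""
        with self._cond:
            if not self._requested:
                return
            self._parked.add(name)
            self._cond.notify_all()
            while self._requested:
                self._cond.wait()
            self._parked.discard(name)

    def request_suspend(self, grace_seconds):
        """Set the flag, wait up to grace_seconds, return the unparked names."""
        deadline = time.monotonic() + grace_seconds
        with self._cond:
            self._requested = True
            while self._parked != self._names:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                self._cond.wait(remaining)
            return self._names - self._parked

    def resume(self):
        with self._cond:
            self._requested = False
            self._cond.notify_all()


def run_hybrid_demo(grace_seconds=0.5):
    susp = HybridSuspender(["managed", "native"])
    stop = threading.Event()
    native_call_done = threading.Event()

    def managed():
        # Polls regularly, like code running in cooperative mode.
        while not stop.is_set():
            susp.poll("managed")

    def native():
        # Stand-in for a long native call: no polling until it returns.
        native_call_done.wait()
        susp.poll("native")

    threads = [threading.Thread(target=managed),
               threading.Thread(target=native)]
    for t in threads:
        t.start()

    stragglers = susp.request_suspend(grace_seconds)
    susp.resume()

    stop.set()
    native_call_done.set()
    for t in threads:
        t.join()
    return sorted(stragglers)
```

In `run_hybrid_demo()`, the "managed" thread parks almost immediately, while the "native" thread is stuck in its simulated native call for the whole grace period and ends up in the returned list. Those are the threads the requester would have to handle some other way.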
I believe this is described in the SSCLI (Rotor) internals.
Makes sense?
Happy hacking!