I'm looking for a way to debug a rare hang/deadlock involving a Delphi 7 critical section (TCriticalSection). If a thread has been waiting on the critical section for more than, say, 10 seconds, I'd like to produce a report with the stack traces of both the thread currently holding the critical section and the thread that failed to acquire it after waiting 10 seconds. It is OK if an exception is then raised or the application terminates.
I would prefer to continue using critical sections, rather than using other synchronization primitives, if possible, but can switch if necessary (such as to get a timeout feature).
If the tool/method works at runtime outside of the IDE, that is a bonus, since this is hard to reproduce on demand. In the rare case I can duplicate the deadlock inside the IDE, if I try to Pause to start debugging, the IDE just sits there doing nothing, and never gets to a state where I can view threads or call stacks. I can Reset the running program, though.
Update: In this case, I'm only dealing with one critical section and 2 threads, so this likely isn't a lock ordering problem. I believe there is an improper nested attempt to enter the lock across two different threads, which results in deadlock.
You should create and use your own lock class. It can be implemented on top of a critical section for normal use, or on top of a mutex while you are debugging, because waiting on a mutex handle accepts a timeout.
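A minimal sketch of such a class, assuming a Win32 mutex is acceptable while debugging. The names `TDebugLock`, `ELockTimeout` and `OwnerThreadId` are hypothetical and chosen for illustration only. It reports which thread timed out and which thread last took the lock; capturing the stack trace of either thread is left to whatever stack-tracing tool you use.

```delphi
unit DebugLock;

interface

uses
  Windows, SysUtils;

type
  ELockTimeout = class(Exception);

  // Hypothetical wrapper: a mutex-based lock whose Enter gives up after a timeout.
  TDebugLock = class
  private
    FMutex: THandle;
    FOwnerThreadId: Cardinal; // last thread that acquired the lock (diagnostic only)
    FTimeoutMs: Cardinal;
  public
    constructor Create(ATimeoutMs: Cardinal);
    destructor Destroy; override;
    procedure Enter;
    procedure Leave;
    property OwnerThreadId: Cardinal read FOwnerThreadId;
  end;

implementation

constructor TDebugLock.Create(ATimeoutMs: Cardinal);
begin
  inherited Create;
  FTimeoutMs := ATimeoutMs;
  // Unnamed, initially unowned mutex.
  FMutex := CreateMutex(nil, False, nil);
  if FMutex = 0 then
    RaiseLastOSError;
end;

destructor TDebugLock.Destroy;
begin
  if FMutex <> 0 then
    CloseHandle(FMutex);
  inherited Destroy;
end;

procedure TDebugLock.Enter;
begin
  case WaitForSingleObject(FMutex, FTimeoutMs) of
    WAIT_OBJECT_0, WAIT_ABANDONED:
      // Acquired (a Win32 mutex, like a critical section, is recursive).
      FOwnerThreadId := GetCurrentThreadId;
    WAIT_TIMEOUT:
      raise ELockTimeout.CreateFmt(
        'Thread %d timed out after %d ms waiting for a lock last taken by thread %d',
        [GetCurrentThreadId, FTimeoutMs, FOwnerThreadId]);
  else
    RaiseLastOSError; // WAIT_FAILED
  end;
end;

procedure TDebugLock.Leave;
begin
  if not ReleaseMutex(FMutex) then
    RaiseLastOSError;
end;

end.
```

Typical use would be `Lock := TDebugLock.Create(10000);` followed by `Lock.Enter; try ... finally Lock.Leave; end;`. A conditional define could switch the implementation back to a plain critical section for release builds.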
Creating your own class has an added benefit: you can implement a locking hierarchy and raise an exception when it is violated. Lock-ordering deadlocks can occur whenever two threads take the same locks in different orders. Assigning a lock level to each lock makes it possible to check that locks are always taken in the correct order. You could store the current lock level in a threadvar and only allow a thread to take locks that have a lower level than the one it currently holds; otherwise you raise an exception. This catches every violation, even when no deadlock actually happens, so it should speed up your debugging a lot.
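The lock-level check could be sketched roughly as follows. `TLeveledLock`, `ELockOrderViolation` and `CurrentLockLevel` are hypothetical names, and the sketch makes two simplifying assumptions: levels are positive integers, and locks are released in the reverse order in which they were taken. Because only strictly lower levels are allowed, re-entering a lock the thread already holds is also reported.

```delphi
unit LeveledLock;

interface

uses
  SysUtils, SyncObjs;

type
  ELockOrderViolation = class(Exception);

  // Hypothetical lock class that enforces "only take locks with a lower level".
  TLeveledLock = class
  private
    FSection: TCriticalSection;
    FLevel: Integer;
    FSavedLevel: Integer; // level the owner held before entering; guarded by FSection
  public
    constructor Create(ALevel: Integer);
    destructor Destroy; override;
    procedure Enter;
    procedure Leave;
  end;

implementation

threadvar
  // Level of the innermost lock held by the current thread.
  // Threadvars start out as 0, which here means "no lock held".
  CurrentLockLevel: Integer;

constructor TLeveledLock.Create(ALevel: Integer);
begin
  inherited Create;
  if ALevel <= 0 then
    raise ELockOrderViolation.Create('Lock level must be positive');
  FLevel := ALevel;
  FSection := TCriticalSection.Create;
end;

destructor TLeveledLock.Destroy;
begin
  FSection.Free;
  inherited Destroy;
end;

procedure TLeveledLock.Enter;
begin
  // Check before blocking, so a violation is reported even if no deadlock occurs.
  if (CurrentLockLevel <> 0) and (FLevel >= CurrentLockLevel) then
    raise ELockOrderViolation.CreateFmt(
      'Lock order violation: tried to take level %d while holding level %d',
      [FLevel, CurrentLockLevel]);
  FSection.Enter;
  FSavedLevel := CurrentLockLevel;
  CurrentLockLevel := FLevel;
end;

procedure TLeveledLock.Leave;
begin
  // Restore the level that was current before this lock was taken.
  CurrentLockLevel := FSavedLevel;
  FSection.Leave;
end;

end.
```

With locks created as `TLeveledLock.Create(20)` and `TLeveledLock.Create(10)`, a thread may take level 20 and then level 10, but taking level 10 first and then level 20 raises `ELockOrderViolation` immediately.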
As for getting the stack trace of the threads, there are many questions here on Stack Overflow dealing with this.
Update
You write:
In this case, I'm only dealing with one critical section and 2 threads, so this likely isn't a lock ordering problem. I believe there is an improper nested attempt to enter the lock across two different threads, which results in deadlock.
That can't be the whole story. On Windows, two threads and a single critical section alone cannot deadlock, because a thread that already owns a critical section can enter it again recursively without blocking. Another blocking mechanism has to be involved, for example a SendMessage() call.
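The recursive behaviour is easy to check with a small console program. This is only an illustration of the point above, not a reproduction of your problem:

```delphi
program RecursiveEnter;

{$APPTYPE CONSOLE}

uses
  SyncObjs;

var
  Lock: TCriticalSection;
begin
  Lock := TCriticalSection.Create;
  try
    Lock.Enter;
    Lock.Enter; // same thread: returns immediately instead of blocking
    Writeln('Entered twice without blocking');
    // Every Enter needs a matching Leave before another thread can get in.
    Lock.Leave;
    Lock.Leave;
  finally
    Lock.Free;
  end;
end.
```

A nested Enter on the same thread therefore cannot hang by itself; the hang needs a second thread that holds the lock while it waits on something only the first thread can provide.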
But if you really are dealing with only two threads, then one of them has to be the main / VCL / GUI thread. In that case you should be able to use the madExcept "Main thread freeze checking" feature. It tries to send a message to the main thread, and reports a failure if a customizable time elapses without the message being handled. If your main thread is blocked on the critical section and the other thread is blocked in a call that waits for the main thread to handle a message, then madExcept should be able to catch this and give you a stack trace for both threads.