How to debug deadlock problems in kernel

Tags: debugging, linux-kernel, kernel, linux-device-driver

I have a buggy kernel module that I am trying to fix. While the module is running, other tasks hang for more than 120 seconds. Almost all of the hung tasks are waiting for either mm->mmap_sem or a file-system lock (inode->i_mutex), so I suspect that my module does not take mmap_sem and the file-system lock (such as inode->i_mutex) in a consistent order, which could cause a deadlock. My module does not take those locks directly, though, so I assume that some function I call takes them. I am now trying to figure out which function calls in my module cause the problem.

However, I am having a hard time debugging it for the following reasons:

1. I don't know exactly which lock the hung task is trying to take. I have the call trace of the hung task, so I know where it hangs. The kernel also prints information such as: "1 lock held by automount/3115: 0: (&type->i_mutex_dir_key#2){--..}, at: [] real_lookup+0x24/0xc5". However, I want to know exactly which lock a task holds and exactly which lock it is trying to acquire. Because the kernel does not print function arguments along with the call trace, I find this information difficult to obtain.
2. I am using gdb and VMware to debug this, which lets me set breakpoints, step into functions, and so on. However, which task hangs, and at what point, is non-deterministic, so I don't know where to set breakpoints or what to inspect. It would be great if I could somehow "attach" to the task that the kernel reports as blocked for more than 120 seconds and get some information about it.

So my questions are as follows:

1. Where can I get, along with the call trace, the arguments of the functions in the trace, so that I can figure out exactly which lock a task is trying to take?
2. Is it possible to use gdb to somehow "attach" to a hung task in the kernel? If not, is there a way to at least examine the data structure that represents that task? I am also having a hard time examining global data structures in the kernel: gdb always complains "can't access memory 0x3200" or something similar.
3. It would also be very helpful to print, for every task in the kernel, which locks it currently holds. Is there a way to do that?

Thank you very much!

588

asked Feb 05 '12 05:02

yangsuli

2 Answers

Not answering your question directly, but hopefully this is more helpful: the Linux kernel has a built-in, heavy-duty lock validator called lockdep. Turn it on (CONFIG_PROVE_LOCKING, under "Kernel hacking") and let it run. If you have a lock-ordering problem, it is likely to catch it and give you a detailed report.

See: http://www.mjmwired.net/kernel/Documentation/lockdep-design.txt
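To make the idea of a lock-ordering check concrete, here is a minimal user-space sketch of the general technique: remember which lock was held when another was taken, and complain as soon as a new acquisition would close a cycle. This is not lockdep's actual implementation. The names `lock_graph`, `record_acquire`, `MAX_LOCKS` and the lock IDs are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8 /* hypothetical limit for this sketch */

/* Hypothetical IDs standing in for the lock classes named in the question. */
enum { LOCK_MMAP_SEM = 0, LOCK_I_MUTEX = 1, LOCK_OTHER = 2 };

/* order[a][b] is true when lock a was held while lock b was acquired. */
struct lock_graph {
    bool order[MAX_LOCKS][MAX_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: do the recorded dependencies lead from 'from' to 'to'? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++) {
        if (g->order[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record "held was held while acquiring wanted".
 * Returns false, and records nothing, if the new dependency would close a
 * cycle, meaning some earlier path took these locks in the opposite order
 * (or held == wanted, i.e. the same lock is taken twice).
 */
static bool record_acquire(struct lock_graph *g, int held, int wanted)
{
    bool seen[MAX_LOCKS] = { false };

    if (reachable(g, wanted, held, seen))
        return false;
    g->order[held][wanted] = true;
    return true;
}
```

With the hypothetical IDs above, recording LOCK_MMAP_SEM followed by LOCK_I_MUTEX succeeds, and a later attempt to record LOCK_I_MUTEX followed by LOCK_MMAP_SEM returns false. That second call is the point at which a validator would print its report, even if the two code paths never actually ran at the same time and no real hang occurred.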

170

answered Sep 28 '22 20:09

gby

The kernel feature lockdep can help you in this regard. Check out my post on how to use it in your kernel: How to use lockdep feature in linux kernel for deadlock detection

answered Sep 28 '22 20:09

brokenfoot
