I am writing a GUI oriented debugger which targets Linux primarily, but I plan ports to other OSes in the future. Because the GUI must stay interactive at all times, I have a few threads handling different things.
Primarily I have a "debug event" thread which simply loops waiting for waitpid to return and delivers the received events to the other threads. I do this because waitpid does not have a timeout, which makes it very hard to integrate it with other event loops and keep things responsive (waitpid can hang indefinitely!).
This strategy has worked wonderfully for the Linux builds so far. Lately I've been trying to make my debugger thread aware (as in the threads in the debugged application, not the debugger itself).
So I set the ptrace options to follow clone events and look for a status which has the upper 16-bit set to PTRACE_EVENT_CLONE. Then I use PTRACE_GETEVENTMSG to get the TID of the new thread. This all works nicely in my small test harness applications. But for some reason, it is failing when i put that code in my actual debugger. (I get a "No such process" error code)
The one thing that occurred to me is that Windows has a rule that only the thread which attached to an application can listen for debug events. Does Linux's ptrace have a similar limitation? If so, why does my code work for other debug events?
EDIT:
It seems that at the very least waitpid supports waiting from a different thread, the man page says:
Before Linux 2.4, a thread was just a special case of a process, and as a consequence one thread could not wait on the children of another thread, even when the latter belongs to the same thread group. However, POSIX prescribes such functionality, and since Linux 2.4 a thread can, and by default will, wait on children of other threads in the same thread group.
So at most this is a ptrace limitation.
I had the same issue (plus many others!) while implementing the Linux-specific part of the Maxine VM debugger. You are correct in your guess that only one thread in the debugger can use ptrace to control the debuggee. We accomplish this by making all calls to ptrace on a dedicated thread. You may find it useful to look at the LinuxTask.java, linuxTask.h and linuxTask.c files in the Maxine sources available at kenai.com/projects/maxine/sources/maxine/show
As far as I can tell, this is not allowed. A task cannot use ptrace on a task which it has not attached.  Also, a task can be traced by at most one other task, so you can't simply attach it once in each thread. I think this is because when one task attaches to another task, the tracing task becomes the parent of the traced task, and each task can only have one parent.
It seems like multi-thread tracing ought to be allowed because the threads are part of the same process, but implementation-wise, there isn't actually much distinction between threads and processes in the Linux kernel. A thread is just a task that happens to share most of its resources with another task.
If you're interested, you can browse the source code for ptrace in the kernel. Specifically look at ptrace_check_attach, which is called by sys_ptrace for most requests. It returns -ESRCH (sounds like the error code you're getting) if the target task's parent is not the current task. 
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With