When every process has its own private memory space that no external process has access to, how does a debugger access a process' memory space?
For example, I can attach gdb to a running process using gdb -p <pid>.
Then I can access all the memory of this process via gdb.
How is gdb able to do this?
I read the relevant questions on SO, and no post seems to answer this point.
When every process has its own private memory space that no external process has access to ...
That's false. External processes with the correct permissions, using the correct APIs, can access another process's memory.
Since the question is tagged Linux and Unix, I'll expand a little on what David Schwartz says, which in short is "there is an API for that in the OS". The same basic principle applies in Windows, but the actual implementation is different. I suspect the implementation inside the OS does much the same thing, but there is no real way to know, since we can't inspect the Windows source code (although, with an understanding of how an OS and a processor work, one can more or less figure out what must be happening).
Linux has a function called ptrace that allows one process (after some privilege checks) to inspect another process in various ways. It is a single call, but the first parameter says "what do you want to do". Here are some of the most basic requests; there are a couple of dozen others for less common operations:
PTRACE_ATTACH - connect to the process.
PTRACE_PEEKTEXT - look at the attached process' code memory (for example, to disassemble the code).
PTRACE_PEEKDATA - look at the attached process' data memory (to display variables).
PTRACE_POKETEXT - write to the process' code memory.
PTRACE_POKEDATA - write to the process' data memory.
PTRACE_GETREGS - copy the current register values.
PTRACE_SETREGS - change the current register values (e.g. for a debugger command of set variable x = 7, if x happens to be in a register).
In Linux, since memory is "all the same", PTRACE_PEEKTEXT and PTRACE_PEEKDATA are actually the same functionality, so you can give an address in code for PTRACE_PEEKDATA and an address, say, on the stack for PTRACE_PEEKTEXT, and it will perfectly happily copy that back for you. The distinction exists for OS/processor combinations where memory is "split" between DATA memory and CODE memory. Most modern OSes and processors do not make that distinction. The same obviously applies to PTRACE_POKEDATA and PTRACE_POKETEXT.
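To make the PEEKTEXT/PEEKDATA point concrete, here is a minimal sketch, assuming Linux with glibc. The names marker, spawn_stopped_child, peek_requests_agree and compare_peek_requests are made up for this illustration. It forks a child that asks to be traced, then reads the same address with both requests, once for a data address and once for a code address, and checks that the results match.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical data object; after fork() the child has it at the same address. */
static long marker = 0x11223344L;

/* Fork a child that asks to be traced and then stops itself.
 * Returns the pid of the stopped child, or -1 on failure. */
static pid_t spawn_stopped_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);          /* the parent sees this stop via waitpid() */
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    return pid;
}

/* Returns 1 if both requests return the same word at addr,
 * 0 if they differ, -1 if either read fails. */
static int peek_requests_agree(pid_t pid, void *addr)
{
    assert(addr != NULL);
    errno = 0;                   /* -1 is also a valid word, so check errno */
    long as_text = ptrace(PTRACE_PEEKTEXT, pid, addr, NULL);
    if (as_text == -1 && errno != 0)
        return -1;
    errno = 0;
    long as_data = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (as_data == -1 && errno != 0)
        return -1;
    return as_text == as_data;
}

/* Compare the two requests on a data address and on a code address. */
int compare_peek_requests(void)
{
    pid_t pid = spawn_stopped_child();
    if (pid < 0)
        return -1;
    int on_data = peek_requests_agree(pid, &marker);
    int on_code = peek_requests_agree(pid,
                      (void *)(uintptr_t)spawn_stopped_child);
    kill(pid, SIGKILL);          /* done with the child */
    waitpid(pid, NULL, 0);
    if (on_data < 0 || on_code < 0)
        return -1;
    return on_data && on_code;
}
```

On a system where tracing your own child is permitted, compare_peek_requests() should return 1. A return of -1 means the child could not be traced or a read failed.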
So, say that the "debugger process" uses:
long data = ptrace(PTRACE_PEEKDATA, pid, (void *)0x12340128, NULL);
When the OS is called with PTRACE_PEEKDATA for address 0x12340128, it "looks" at the corresponding memory mapping for the page containing that address (page-aligned, with 4 KiB pages, that is 0x12340000). If the mapping exists, the page gets mapped into the kernel, the data is copied out from address 0x12340128 into local memory, the page is unmapped again, and the copied data is passed back as the return value.
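Since each PTRACE_PEEKDATA call hands back only one word, a debugger that wants a larger span has to loop. The following is a rough sketch of that loop, assuming Linux with glibc; note, spawn_stopped_child, read_tracee and demo_read_tracee are names invented for this example. Because the word is the return value, -1 is ambiguous, so the sketch clears errno before each call and checks it afterwards. The child writes a string into its own copy of a buffer, and the parent reads that string back out of the child's memory.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical buffer: only the child writes to it, so the parent's
 * copy stays empty (fork() gives each process its own copy). */
static char note[32];

/* Fork a child that fills in note, asks to be traced, and stops. */
static pid_t spawn_stopped_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        snprintf(note, sizeof note, "child %ld", (long)getpid());
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    return pid;
}

/* Copy len bytes from addr in the tracee into buf, one word per call.
 * Reads are word-aligned so no single read crosses a page boundary. */
static int read_tracee(pid_t pid, const void *addr, void *buf, size_t len)
{
    assert(buf != NULL);
    uintptr_t cur = (uintptr_t)addr;
    char *dst = buf;
    while (len > 0) {
        uintptr_t base = cur & ~(uintptr_t)(sizeof(long) - 1);
        size_t off = cur - base;
        size_t n = sizeof(long) - off;
        if (n > len)
            n = len;
        errno = 0;               /* -1 may be real data, so check errno */
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)base, NULL);
        if (word == -1 && errno != 0)
            return -1;
        memcpy(dst, (char *)&word + off, n);
        dst += n;
        cur += n;
        len -= n;
    }
    return 0;
}

/* Returns 1 if the parent read the child's string, 0 if not, -1 on error. */
int demo_read_tracee(void)
{
    pid_t pid = spawn_stopped_child();
    if (pid < 0)
        return -1;
    char expected[sizeof note], copy[sizeof note];
    snprintf(expected, sizeof expected, "child %ld", (long)pid);
    int rc = read_tracee(pid, note, copy, sizeof copy);
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    if (rc != 0)
        return -1;
    copy[sizeof copy - 1] = '\0';
    /* The parent's own note is still empty; the text came from the child. */
    return note[0] == '\0' && strcmp(copy, expected) == 0;
}
```

Real debuggers often use faster paths for bulk reads, but the word-at-a-time loop is the part that maps directly onto the PTRACE_PEEKDATA description above.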
The manual describes how tracing is initiated:
The parent can initiate a trace by calling fork(2) and having the resulting child do a PTRACE_TRACEME, followed (typically) by an exec(3). Alternatively, the parent may commence trace of an existing process using PTRACE_ATTACH.
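The fork / PTRACE_TRACEME / exec sequence from the manual can be sketched roughly as follows, assuming Linux with glibc and a true program somewhere on PATH. The function name trace_new_program is made up for this example. After a successful exec, a traced child is stopped with SIGTRAP before the new program runs, which is the point where a debugger would set breakpoints or inspect memory.

```c
#include <assert.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start the program "true" under trace.
 * Returns the signal that stopped the child at exec, or -1 on failure. */
int trace_new_program(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced by the parent, then become the program. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        execlp("true", "true", (char *)NULL);
        _exit(127);              /* only reached if exec failed */
    }
    assert(pid > 0);             /* parent from here on */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;               /* child exited instead of stopping */
    int stop_signal = WSTOPSIG(status);

    /* A debugger would inspect or modify the child here. */

    /* Resume the child without delivering the stop signal to it. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return stop_signal;
}
```

If tracing is permitted, trace_new_program() should return SIGTRAP. Attaching to an already running process with PTRACE_ATTACH follows the same wait-then-inspect pattern, but is subject to stricter permission checks.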
For several more pages of information, see man ptrace.