
How does the processor know the breakpoints?

Let's consider this very simple program:

#include <stdio.h>

int main(void)
{
    int num1=4, num2=5;
    printf("Welcome\n");
    printf("num1 + num2 = %d\n", num1+num2);
    return 0;
}

Looking at the assembly code generated by gcc -S prog.c:

    .file   "p.c"
    .def    ___main;    .scl    2;  .type   32; .endef
    .section .rdata,"dr"
LC0:
    .ascii "Welcome\0"
LC1:
    .ascii "num1 + num2 = %d\12\0"
    .text
    .globl  _main
    .def    _main;  .scl    2;  .type   32; .endef
_main:
LFB10:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $32, %esp
    call    ___main
    movl    $4, 28(%esp)
    movl    $5, 24(%esp)
    movl    $LC0, (%esp)
    call    _puts
    movl    28(%esp), %edx
    movl    24(%esp), %eax
    addl    %edx, %eax
    movl    %eax, 4(%esp)
    movl    $LC1, (%esp)
    call    _printf
    call    _getchar
    movl    $0, %eax
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
LFE10:
    .ident  "GCC: (GNU) 5.3.0"
    .def    _puts;  .scl    2;  .type   32; .endef
    .def    _printf;    .scl    2;  .type   32; .endef
    .def    _getchar;   .scl    2;  .type   32; .endef

I know that the CPU sees the assembled code that the compiler generated for it. What I don't understand is how the program stops at a breakpoint that the user set. Why doesn't the CPU just continue running the program? What happens, and how? I mean, why would it stop after fetching an instruction?

I am a bit confused about this. Is Code::Blocks (or whatever program the user is using) taking care of it?

Thanks in advance!

James1234 asked May 17 '17 21:05



1 Answer

Most modern instruction sets include a breakpoint exception. It lets a debugger insert a breakpoint into a program's code by temporarily replacing the instruction at that location with a special software interrupt instruction. On the x86/x86-64 ISA, this instruction is int3 ("interrupt vector 3"), which is encoded as the single byte 0xCC.

An important thing to note about breakpoint instructions is that they generally must be no longer than the shortest possible instruction on the ISA. There are a few reasons for this. Some ISAs require minimum alignments for instructions; a shorter instruction will typically have less strict alignment requirements. Additionally, replacing an instruction with a longer one means that you are likely to overwrite part of the instruction that follows it. This might not be a big deal in single-threaded applications, but in multithreaded applications it is a show-stopper. Consider, for example, what happens if you replace a short instruction at the end of a conditional branch with a longer one: another running thread that skips the branch lands on the instruction right after it, whose first bytes have now been overwritten.
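To make the size argument concrete, here is a minimal sketch that patches a copy of the first two instructions of the _main listing from the question: pushl %ebp is the single byte 0x55, and movl %esp, %ebp is the two bytes 0x89 0xE5. The helper name patch_bytes and the buffer layout are made up for illustration. It only edits a byte array and never executes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: overwrite code[offset..] with patch_len bytes and
 * report how many bytes beyond the original instruction (which is
 * insn_len bytes long) were clobbered. */
size_t patch_bytes(uint8_t *code, size_t code_len, size_t offset,
                   size_t insn_len, const uint8_t *patch, size_t patch_len)
{
    assert(offset + patch_len <= code_len);
    memcpy(code + offset, patch, patch_len);
    return patch_len > insn_len ? patch_len - insn_len : 0;
}
```

Patching offset 0 with the one-byte int3 (0xCC) returns 0 and leaves 0x89 0xE5 intact. Patching it with the two-byte encoding of int $3 (0xCD 0x03) returns 1, and the 0x89 that starts the movl has become 0x03, so any thread that jumps straight to the movl decodes garbage.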

In other cases, such a special instruction may not exist. On hardware platforms lacking a dedicated breakpoint instruction, special hardware registers are sometimes provided that make the processor trap when it attempts to access a particular location in memory. These registers are typically few in number, so when debugging with many breakpoints, a dedicated breakpoint instruction is extremely useful.
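As an illustration of how limited such registers can be, x86 (which has both mechanisms) offers four address registers, DR0 to DR3, controlled through DR7. The sketch below only computes the DR7 value that would enable an execution breakpoint in one slot. The function name is hypothetical, and actually loading the debug registers is a privileged operation that a user-space debugger has to request from the OS.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_HW_BREAKPOINTS 4  /* x86 address registers DR0..DR3 */

/* Returns the DR7 control value that locally enables an execution
 * breakpoint in the given slot (0..3). R/W = 00 and LEN = 00 mean
 * "break on instruction execution", so those fields are cleared and
 * only the local-enable bit is set. */
uint32_t dr7_enable_exec(uint32_t dr7, unsigned slot)
{
    assert(slot < NUM_HW_BREAKPOINTS);
    dr7 &= ~(0xFu << (16 + 4 * slot));  /* clear R/W and LEN for this slot */
    dr7 |= 1u << (2 * slot);            /* set local-enable bit L<slot> */
    return dr7;
}
```

The breakpoint address itself would go into the matching DRn register. With only four slots, a fifth breakpoint has to fall back to the instruction-patching approach.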

When you start your program in a debugger and add a software breakpoint, what typically happens is something like the following:

1. The debugger loads the program into memory and gives you an input prompt.
2. You tell the debugger to add a breakpoint. It may use debug information to work out which address in the in-memory image of the program your breakpoint corresponds to.
3. The debugger decodes the instruction at that address (because it generally wants to replace the whole instruction) and replaces it, in memory, with the breakpoint instruction.
4. You tell the debugger to execute, or continue executing, the program.
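The bookkeeping behind these steps can be sketched as follows. This is a simulation on a plain byte array, not how any particular debugger is implemented: struct breakpoint, bp_insert and bp_remove are hypothetical names, and the x86 int3 opcode is assumed. On x86 it is enough to overwrite the first byte of the instruction, so one saved byte per breakpoint suffices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC  /* x86 one-byte breakpoint instruction */

/* Hypothetical record a debugger might keep for each breakpoint. */
struct breakpoint {
    size_t  offset;      /* where in the code image the patch was made */
    uint8_t saved_byte;  /* original first byte of the instruction */
    int     enabled;     /* nonzero while the patch is in place */
};

/* Plant: remember the original byte, then overwrite it with int3. */
void bp_insert(struct breakpoint *bp, uint8_t *code, size_t len, size_t offset)
{
    assert(offset < len);
    bp->offset = offset;
    bp->saved_byte = code[offset];
    bp->enabled = 1;
    code[offset] = INT3_OPCODE;
}

/* Remove: put the original byte back so the real instruction can run. */
void bp_remove(struct breakpoint *bp, uint8_t *code)
{
    if (bp->enabled) {
        code[bp->offset] = bp->saved_byte;
        bp->enabled = 0;
    }
}
```

A real debugger performs the same read and write on the debuggee's memory through an OS interface rather than on a local array, and it removes the patch temporarily whenever the original instruction has to execute.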

When the processor encounters this instruction, it generates a trap. This trap is delivered as an interrupt to the operating system, which notices that the trap is intended for debugging the program. The OS knows which program is being executed (and therefore also who is executing it), so it may do some permission checks to make sure the user debugging the application is actually allowed to do so at this point. If all looks good, the OS notifies the debugger that a breakpoint was encountered, and the debugger tells you that the program has stopped.
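You can watch the trap being delivered without a debugger at all. The sketch below assumes a POSIX system; on x86 it executes a real int3, and on other architectures it falls back to raise(SIGTRAP), which produces the same signal without a hardware trap. With no debugger attached, the OS has nobody else to notify, so it hands the process itself a SIGTRAP. The names on_sigtrap and trigger_breakpoint are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Incremented by the handler each time a SIGTRAP arrives. */
static volatile sig_atomic_t trap_count = 0;

static void on_sigtrap(int signo)
{
    (void)signo;
    trap_count++;
}

/* Executes one breakpoint and returns how many SIGTRAPs the handler saw.
 * Returns -1 if the handler could not be installed. */
int trigger_breakpoint(void)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigtrap;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTRAP, &sa, &old) != 0)
        return -1;

    trap_count = 0;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("int3");  /* the 0xCC breakpoint instruction */
#else
    raise(SIGTRAP);            /* fallback: same signal, no hardware trap */
#endif
    sigaction(SIGTRAP, &old, NULL);  /* restore the previous handler */
    return (int)trap_count;
}
```

Calling trigger_breakpoint() outside a debugger should return 1. Under a debugger the result can differ, because the OS reports the trap to the debugger first, which is exactly the notification path described above.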

This isn't a universal explanation. Significant OS support is required for the above to be true. On Linux and the BSDs, most of this functionality is exposed through the ptrace(2) syscall, which allows reading and replacing instructions in another process as well as single-stepping it. OS X, although POSIX-compliant, implements only a limited ptrace(2) and instead provides this functionality through Mach ports. Windows has something else entirely.
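For the Linux case, here is a rough sketch of a parent process acting as a tiny debugger for its own forked child. It assumes Linux on x86-64 and compiles to a stub elsewhere. The names target, give_up and run_breakpoint_demo are hypothetical, most error handling is omitted, and a real debugger does considerably more.

```c
#if defined(__linux__) && defined(__x86_64__)
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile int sink;

/* The function we break on; noinline so the call really happens. */
__attribute__((noinline)) void target(void) { sink = 42; }

/* Kill and reap the child, then report "ptrace unavailable". */
static int give_up(pid_t pid)
{
    int status;
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return -1;
}

/* Returns 1 if the child stopped at the breakpoint on target(),
 * 0 if it did not, and -1 if ptrace is unavailable here. */
int run_breakpoint_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                       /* child: the debuggee */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);                   /* pause until the parent patches us */
        target();
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;                        /* child could not be traced */

    uintptr_t addr = (uintptr_t)&target;
    errno = 0;
    long orig = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (orig == -1 && errno != 0)
        return give_up(pid);

    /* x86 is little-endian: the low byte of the word is the first byte. */
    long patched = (orig & ~0xFFL) | 0xCC;
    if (ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)patched) == -1)
        return give_up(pid);
    ptrace(PTRACE_CONT, pid, NULL, NULL);
    waitpid(pid, &status, 0);

    int hit = 0;
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, pid, NULL, &regs);
        hit = (regs.rip - 1 == addr);     /* rip points just past the int3 */
        ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)orig);
        regs.rip = addr;                  /* re-run the original instruction */
        ptrace(PTRACE_SETREGS, pid, NULL, &regs);
        ptrace(PTRACE_CONT, pid, NULL, NULL);
        waitpid(pid, &status, 0);
    }
    if (WIFSTOPPED(status))
        give_up(pid);                     /* do not leave a stopped child behind */
    return hit;
}
#else
/* Stub for other platforms: this sketch is Linux/x86-64 only. */
int run_breakpoint_demo(void) { return -1; }
#endif
```

Where ptrace is permitted, run_breakpoint_demo() is expected to return 1: the child runs into the patched byte, the kernel stops it with SIGTRAP and wakes the parent's waitpid, and the parent restores the original byte and rewinds rip before letting the child finish. Sandboxes that forbid ptrace make it return -1 instead.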

On embedded systems, special hardware ports (like JTAG) might be supplied to allow introspection at a hardware level, allowing the development of an external debugger, which "speaks" directly to the hardware using JTAG.

Tony Tannous answered Oct 26 '22 22:10
