How can I set a breakpoint in GDB for the open(2) syscall returning -1?

OS: GNU/Linux
Distro: OpenSuSe 13.1
Arch: x86-64
GDB version: 7.6.50.20130731-cvs
Program language: mostly C with minor bits of assembly

Imagine that I've got a rather big program that sometimes fails to open a file. Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?

Of course, I can grep through the source code, find all open(2) invocations, and narrow down the faulting open() call, but maybe there's a better way.

I tried to use "catch syscall open" then "condition N if $rax==-1", but obviously it didn't get hit.
BTW, is it possible to distinguish between a call to a syscall (e.g. open(2)) and a return from a syscall (e.g. open(2)) in GDB?

As a current workaround I do the following:

1. Run the program in question under GDB.

2. From another terminal, launch a SystemTap script:

    stap -g -v -e 'probe process("PATH to the program run under GDB").syscall.return { if( $syscall == 2 && $return <0) raise(%{ SIGSTOP %}) }'

3. After open(2) returns -1, I receive SIGSTOP in the GDB session and can debug the issue.

TIA.

Best regards,
alexz.

UPD: Even though I had tried the approach suggested by n.m. before and wasn't able to make it work, I decided to give it another try. After 2 hours it now works as intended, but with some weird workarounds:

1. I still can't distinguish between a call to a syscall and a return from it.
2. If I use finish in comm, I can't use continue, which is OK according to the GDB docs, i.e. the following drops to the gdb prompt on each break:
```
gdb> comm
gdb> finish
gdb> printf "rax is %d\n",$rax
gdb> cont
gdb> end
```
3. Actually, I can avoid using finish and check %rax in commands, but in that case I have to check for -errno rather than -1, e.g. for "Permission denied" I have to check for -13, and for "No such file or directory" for -2. That just doesn't feel right.

So the only way I could make it work was to define a custom command and use it in the following way:

(gdb) catch syscall open
Catchpoint 1 (syscall 'open' [2])
(gdb) define mycheck
Type commands for definition of "mycheck".
End with a line saying just "end".
>finish
>finish
>if ($rax != -1)
 >cont
 >end
>printf "rax is %d\n",$rax
>end
(gdb) comm
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>mycheck
>end
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/alexz/gdb_syscall_test/main
.....
Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
24                      fd = open(filenames[i], O_RDONLY);
Opening test1
fd = 3 (0x3)
Successfully opened test1

Catchpoint 1 (call to syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
rax is -38

Catchpoint 1 (returned from syscall open), 0x00007ffff7b093f0 in __open_nocancel () from /lib64/libc.so.6
0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
---Type <return> to continue, or q <return> to quit---
24                      fd = open(filenames[i], O_RDONLY);
rax is -1
(gdb) bt
#0  0x0000000000400756 in main (argc=1, argv=0x7fffffffdb18) at main.c:24
(gdb) step
26                      printf("Opening %s\n", filenames[i]);
(gdb) info locals
i = 1
fd = -1

asked Sep 22 '14 11:09

Alex Z

2 Answers

This gdb script does what's requested:

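# $outside is 1 while the program is outside the syscall, 0 while inside it
set $outside = 1
catch syscall open
commands
  silent
  # the catchpoint fires on both entry and return, so toggle on every stop
  set $outside = ! $outside
  # syscall returned without an error: keep going
  if ( $outside && $rax >= 0)
    continue
  end
  # syscall entry: nothing to check yet
  if ( !$outside )
    continue
  end
  echo `open' returned a negative value\n
end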

The $outside variable is needed because GDB stops at both syscall entry and syscall exit. We need to ignore entry events and check $rax only at exit.
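To try the script, it helps to have a small target that mixes successful and failing opens. The following is only a sketch of such a target: the helper name `count_failed_opens` and the `STANDALONE_DEMO` macro are made up for illustration. One assumption to verify on your own system: newer glibc releases typically implement `open()` with the `openat` syscall, in which case the catchpoint would need to be `catch syscall openat` instead.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: tries to open each path read-only and
 * returns how many of the opens failed. */
size_t count_failed_opens(const char *const *paths, size_t n)
{
    size_t failed = 0;

    for (size_t i = 0; i < n; i++) {
        int fd = open(paths[i], O_RDONLY);  /* the call the catchpoint is after */
        if (fd == -1) {
            perror(paths[i]);               /* report which path failed and why */
            failed++;
        } else {
            close(fd);
        }
    }
    return failed;
}

#ifdef STANDALONE_DEMO
/* Optional entry point: every command-line argument is treated as a path. */
int main(int argc, char **argv)
{
    size_t failed = count_failed_opens((const char *const *)(argv + 1),
                                       (size_t)(argc - 1));
    return failed ? 1 : 0;
}
#endif
```

Building with `-g -DSTANDALONE_DEMO` and passing a few existing and nonexistent paths as arguments gives a program where only some `open` calls should stop in GDB.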

answered Oct 11 '22 21:10

n. 1.8e9-where's-my-share m.

> Is it possible to set a breakpoint in GDB in such a way that it stops after the open(2) syscall returns -1?

It's hard to do better than n.m.'s answer for this narrow question, but I would argue that the question is posed incorrectly.

> Of course, I can grep through the source code and find all open(2) invocations

That is part of your confusion: when you call open in a C program, you are not in fact executing the open(2) system call. Rather, you are invoking an open(3) "stub" from your libc, and that stub executes the open(2) system call for you.

And if you want to set a breakpoint when the stub is about to return -1, that is very easy.

Example:

/* t.c */
#include <sys/stat.h>
#include <fcntl.h>

int main()
{
  int fd = open("/no/such/file", O_RDONLY);
  return fd == -1 ? 0 : 1;
}

$ gcc -g t.c; gdb -q ./a.out
(gdb) start
Temporary breakpoint 1 at 0x4004fc: file t.c, line 6.
Starting program: /tmp/a.out

Temporary breakpoint 1, main () at t.c:6
6     int fd = open("/no/such/file", O_RDONLY);
(gdb) s
open64 () at ../sysdeps/unix/syscall-template.S:82
82  ../sysdeps/unix/syscall-template.S: No such file or directory.

Here we've reached the glibc system call stub. Let's disassemble it:

(gdb) disas
Dump of assembler code for function open64:
=> 0x00007ffff7b01d00 <+0>: cmpl   $0x0,0x2d74ad(%rip)        # 0x7ffff7dd91b4 <__libc_multiple_threads>
   0x00007ffff7b01d07 <+7>: jne    0x7ffff7b01d19 <open64+25>
   0x00007ffff7b01d09 <+0>: mov    $0x2,%eax
   0x00007ffff7b01d0e <+5>: syscall
   0x00007ffff7b01d10 <+7>: cmp    $0xfffffffffffff001,%rax
   0x00007ffff7b01d16 <+13>:    jae    0x7ffff7b01d49 <open64+73>
   0x00007ffff7b01d18 <+15>:    retq
   0x00007ffff7b01d19 <+25>:    sub    $0x8,%rsp
   0x00007ffff7b01d1d <+29>:    callq  0x7ffff7b1d050 <__libc_enable_asynccancel>
   0x00007ffff7b01d22 <+34>:    mov    %rax,(%rsp)
   0x00007ffff7b01d26 <+38>:    mov    $0x2,%eax
   0x00007ffff7b01d2b <+43>:    syscall
   0x00007ffff7b01d2d <+45>:    mov    (%rsp),%rdi
   0x00007ffff7b01d31 <+49>:    mov    %rax,%rdx
   0x00007ffff7b01d34 <+52>:    callq  0x7ffff7b1d0b0 <__libc_disable_asynccancel>
   0x00007ffff7b01d39 <+57>:    mov    %rdx,%rax
   0x00007ffff7b01d3c <+60>:    add    $0x8,%rsp
   0x00007ffff7b01d40 <+64>:    cmp    $0xfffffffffffff001,%rax
   0x00007ffff7b01d46 <+70>:    jae    0x7ffff7b01d49 <open64+73>
   0x00007ffff7b01d48 <+72>:    retq
   0x00007ffff7b01d49 <+73>:    mov    0x2d10d0(%rip),%rcx        # 0x7ffff7dd2e20
   0x00007ffff7b01d50 <+80>:    xor    %edx,%edx
   0x00007ffff7b01d52 <+82>:    sub    %rax,%rdx
   0x00007ffff7b01d55 <+85>:    mov    %edx,%fs:(%rcx)
   0x00007ffff7b01d58 <+88>:    or     $0xffffffffffffffff,%rax
   0x00007ffff7b01d5c <+92>:    jmp    0x7ffff7b01d48 <open64+72>
End of assembler dump.

Here you can see that the stub behaves differently depending on whether the program has multiple threads or not. This has to do with asynchronous cancellation.

There are two syscall instructions, and in the general case we'd need to set a breakpoint after each one (but see below).

But this example is single-threaded, so I can set a single conditional breakpoint:

(gdb) b *0x00007ffff7b01d10 if $rax < 0
Breakpoint 2 at 0x7ffff7b01d10: file ../sysdeps/unix/syscall-template.S, line 82.
(gdb) c
Continuing.

Breakpoint 2, 0x00007ffff7b01d10 in __open_nocancel () at ../sysdeps/unix/syscall-template.S:82
82  in ../sysdeps/unix/syscall-template.S
(gdb) p $rax
$1 = -2

Voilà, the open(2) system call returned -2, which the stub will translate into setting errno to ENOENT (which is 2 on this system) and returning -1.

Had open(2) succeeded, the condition $rax < 0 would have been false, and GDB would have kept going.

That is precisely the behavior one usually wants from GDB when looking for one failing system call among many succeeding ones.
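The difference between the raw kernel return value and what the stub hands back can also be observed from C. The sketch below assumes x86-64 Linux and GCC-style inline assembly; `raw_open` and `libc_open_errno` are hypothetical helper names, not part of any library.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

#if !defined(__linux__) || !defined(__x86_64__)
#error "This sketch assumes x86-64 Linux"
#endif

/* Issues the open syscall directly, bypassing the libc stub.
 * Returns whatever the kernel leaves in %rax: a file descriptor
 * on success, or -errno on failure. */
long raw_open(const char *path, int flags, int mode)
{
    long ret;

    __asm__ volatile("syscall"
                     : "=a"(ret)
                     : "a"((long)SYS_open), "D"(path),
                       "S"((long)flags), "d"((long)mode)
                     : "rcx", "r11", "memory");
    return ret;
}

/* Goes through the libc stub instead. Returns the errno value left
 * behind by a failed open(), or 0 if the open succeeded. */
int libc_open_errno(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}
```

For a nonexistent path, `raw_open` should return `-ENOENT`, the same kind of value seen in `$rax` at the breakpoint, while `libc_open_errno` reports `ENOENT` after `open` itself returned -1.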

Update:

As Chris Dodd points out, there are two syscall instructions, but on error they both branch to the same error-handling code (the code that sets errno). Thus, we can set an unconditional breakpoint on *0x00007ffff7b01d49, and that breakpoint will fire only on failure.

This is much better, because conditional breakpoints slow down execution quite a lot when the condition is false: GDB has to stop the inferior, evaluate the condition, and then resume the inferior.
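The shared error path starting at open64+73 can be read roughly as the following C. This is a sketch of what the disassembly does, not glibc's actual source, and the name `stub_translate` is made up for illustration. The comparison against 0xfffffffffffff001 is an unsigned test that is true exactly when the raw value lies in the range -4095..-1.

```c
#include <errno.h>

/* Hypothetical helper mirroring the tail of the stub: takes the raw
 * value the kernel left in %rax and produces what the caller sees. */
long stub_translate(long raw)
{
    /* Same check as "cmp $0xfffffffffffff001,%rax; jae <error path>":
     * unsigned comparison, so it matches only -4095..-1. */
    if ((unsigned long)raw >= (unsigned long)-4095L) {
        errno = (int)-raw;  /* "sub %rax,%rdx; mov %edx,%fs:(%rcx)" */
        return -1;          /* "or $0xffffffffffffffff,%rax" */
    }
    return raw;             /* success: value is passed through unchanged */
}
```

Because only failing calls reach the branch that sets `errno`, a breakpoint placed on that branch needs no condition.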

answered Oct 11 '22 22:10

Employed Russian
