std::bind and stack-use-after-scope

Tags:

c++

c++11

So, today I was running some code built with Address Sanitizer and have stumbled upon a strange stack-use-after-scope bug. I have this simplified example:

#include <functional>
class k
{
public: operator int(){return 5;}
};

const int& n(const int& a)
{
  return a;
}

int main()
{
  k l;
  return std::bind(n, l)();
}

ASAN complains about the last code line:

==27575==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffeab375210 at pc 0x000000400a01 bp 0x7ffeab3750e0 sp 0x7ffeab3750d8
READ of size 4 at 0x7ffeab375210 thread T0
    #0 0x400a00  (/root/tstb.exe+0x400a00)
    #1 0x7f97ce699730 in __libc_start_main (/lib64/libc.so.6+0x20730)
    #2 0x400a99  (/root/tstb.exe+0x400a99)

Address 0x7ffeab375210 is located in stack of thread T0 at offset 288 in frame
    #0 0x40080f  (/root/tstb.exe+0x40080f)

  This frame has 6 object(s):
    [32, 33) '<unknown>'
    [96, 97) '<unknown>'
    [160, 161) '<unknown>'
    [224, 225) '<unknown>'
    [288, 292) '<unknown>' <== Memory access at offset 288 is inside this variable
    [352, 368) '<unknown>'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-scope (/root/tstb.exe+0x400a00)
Shadow bytes around the buggy address:
  0x1000556669f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100055666a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100055666a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
  0x100055666a20: f1 f1 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
  0x100055666a30: f2 f2 f8 f2 f2 f2 f2 f2 f2 f2 f8 f2 f2 f2 f2 f2
=>0x100055666a40: f2 f2[f8]f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f3 f3
  0x100055666a50: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100055666a60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100055666a70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100055666a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100055666a90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==27575==ABORTING

If I understand correctly, it says that we are accessing a stack variable after it has already gone out of scope. Looking at the uninstrumented, unoptimized disassembly, I can indeed see this happening inside the instantiated __invoke_impl:

Dump of assembler code for function std::__invoke_impl<int const&, int const& (*&)(int const&), k&>(std::__invoke_other, int const& (*&)(int const&), k&):
   0x0000000000400847 <+0>:     push   %rbp
   0x0000000000400848 <+1>:     mov    %rsp,%rbp
   0x000000000040084b <+4>:     push   %rbx
   0x000000000040084c <+5>:     sub    $0x28,%rsp
   0x0000000000400850 <+9>:     mov    %rdi,-0x28(%rbp)
   0x0000000000400854 <+13>:    mov    %rsi,-0x30(%rbp)
   0x0000000000400858 <+17>:    mov    -0x28(%rbp),%rax
   0x000000000040085c <+21>:    mov    %rax,%rdi
   0x000000000040085f <+24>:    callq  0x4007a2 <std::forward<int const& (*&)(int const&)>(std::remove_reference<int const& (*&)(int const&)>::type&)>
   0x0000000000400864 <+29>:    mov    (%rax),%rbx
   0x0000000000400867 <+32>:    mov    -0x30(%rbp),%rax
   0x000000000040086b <+36>:    mov    %rax,%rdi
   0x000000000040086e <+39>:    callq  0x4005c4 <std::forward<k&>(std::remove_reference<k&>::type&)>
   0x0000000000400873 <+44>:    mov    %rax,%rdi
   0x0000000000400876 <+47>:    callq  0x40056a <k::operator int()>
   0x000000000040087b <+52>:    mov    %eax,-0x14(%rbp)
   0x000000000040087e <+55>:    lea    -0x14(%rbp),%rax
   0x0000000000400882 <+59>:    mov    %rax,%rdi
   0x0000000000400885 <+62>:    callq  *%rbx
=> 0x0000000000400887 <+64>:    add    $0x28,%rsp
   0x000000000040088b <+68>:    pop    %rbx
   0x000000000040088c <+69>:    pop    %rbp
   0x000000000040088d <+70>:    retq
End of assembler dump.

After calling k::operator int(), it places the returned value on its own stack frame and passes the address of that slot to n(), which immediately returns it. That address is then returned from __invoke_impl itself (and goes all the way up to main's return).

So, it looks like ASAN is right here and we really have a stack-use-after-scope access.

The question is: What is wrong with my code?

I have tried building it with gcc, clang and icc, and they all produce similar assembly output.

Sergey Stepanov asked Feb 12 '18 14:02


2 Answers

std::bind essentially generates an implementation function object that calls the bound function with the desired arguments. In your case, this implementation function object is roughly equivalent to

struct Impl
{
    const int &operator()() const
    {
        int tmp = k_;
        return n(tmp);
    }

private:
    k k_;

    Impl(/*unspecified*/);
};

Since n returns its argument as a const reference, the call operator of Impl will return a reference to a local variable, which is a dangling reference, which is then read from in main. Hence the stack use after scope error.

Your confusion may stem from the fact that return n(l); without the bind is expected to work fine here. In that case, however, the temporary int is created in the stack frame of main and lives until the end of the full expression that makes up the operand of return. By then, the referenced value has already been read and copied into main's int return value.

In other words, while a temporary lives until the end of the full expression in which it was created, this is not the case for temporaries generated inside functions called within that full expression. These are considered part of a different full expression and are destroyed when that expression has been evaluated.
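To make the difference concrete, here is a minimal sketch that reuses k and n from the question. The names direct_call and ImplByValue are made up for illustration; ImplByValue is a hypothetical variant of the Impl struct above whose call operator returns int instead of const int&, so the value is copied out while tmp is still alive. Neither function reads a dead object.

```cpp
class k {
public:
    operator int() { return 5; }
};

const int& n(const int& a) { return a; }

// Direct call: the temporary int produced by k::operator int() belongs to
// the full expression `return n(l);`, so it is still alive when the
// returned reference is copied into the int return value.
int direct_call() {
    k l;
    return n(l);
}

// Hypothetical by-value variant of Impl (not what std::bind generates):
// same body, but operator() returns int, so the value is copied out
// before the local tmp is destroyed.
struct ImplByValue {
    k k_;
    int operator()() {
        int tmp = k_;
        return n(tmp);
    }
};
```

Changing the return type of ImplByValue::operator() back to const int& reproduces the dangling reference described above.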

PS: For this reason, binding any function (object) of signature R(Args...) to a std::function<const R&(Args...)> results in a guaranteed return of a dangling reference when called – a construct that IMO the library should reject at compile time.

Arne Vogel answered Sep 19 '22 01:09


OK, this is a tough one if you don't know the specifics of std::bind.

When binding an argument to a callable with std::bind, a copy of the argument is made (source):

The arguments to bind are copied or moved, and are never passed by reference unless wrapped in std::ref or std::cref.

std::bind(n, l) returns a callable object of unspecified type that has a member object of type k, built as a copy of l. Please note that this callable object is a temporary (an rvalue). I'll give it a name: bindtmp.
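The copy semantics quoted above can be checked with a small sketch. The function bump and the two wrappers are hypothetical names used only for this illustration; they are not part of the question's code.

```cpp
#include <functional>

// Hypothetical helper: increments whatever it is given by reference.
void bump(int& x) { ++x; }

// std::bind stores a copy of v, so bump modifies the copy held inside
// the bind object and the local v stays unchanged.
int bump_copy() {
    int v = 0;
    std::bind(bump, v)();
    return v;
}

// std::ref wraps v in a std::reference_wrapper, so the bind object
// forwards a reference to the original v and bump modifies it.
int bump_ref() {
    int v = 0;
    std::bind(bump, std::ref(v))();
    return v;
}
```

Here bump_copy() returns 0 and bump_ref() returns 1, which shows that the bound k in the question is a copy of l living inside the bind object.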

When invoked, bindtmp() creates a temporary integer (inttemp, with value 5) in order to apply bindtmp::ncopy to bindtmp::lcopy (those are the member objects constructed from ::n and main::l). ::n returns a const reference to inttemp, and bindtmp() passes that reference on in its own return statement.

This is where things get tricky (source):

Whenever a reference is bound to a temporary or to a subobject thereof, the lifetime of the temporary is extended to match the lifetime of the reference, with the following exceptions:
- a temporary bound to a return value of a function in a return statement is not extended: it is destroyed immediately at the end of the return expression. Such function always returns a dangling reference.
- ...

This means the temporary inttemp is destroyed at the end of the return statement inside bindtmp(), that is, right after ::n has returned.

From this point, everything falls apart. bindtmp() returns a reference to an object whose lifetime has ended, main tries to convert it to an rvalue (that is, to read the int), and this is where undefined behaviour (accessing an object on the stack after its lifetime has ended) happens.
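Two possible ways to avoid the dangling reference, sketched under the assumption that the goal is simply to get the value 5 back from the call. The function names bind_int and lambda_by_value are made up for this example. The first converts l to int before binding, so the bind object stores an int and no temporary is created during the call. The second replaces std::bind with a lambda that returns by value.

```cpp
#include <functional>

class k {
public:
    operator int() { return 5; }
};

const int& n(const int& a) { return a; }

// Convert before binding: the bind object stores an int, and n returns a
// reference to that stored int. The bind object is a temporary that lives
// until the end of the return statement, after the value has been copied.
int bind_int() {
    k l;
    return std::bind(n, static_cast<int>(l))();
}

// Lambda returning by value: the temporary int created for n's parameter
// lives until the end of `return n(l);`, and the int return value is
// initialized before that temporary is destroyed.
int lambda_by_value() {
    k l;
    auto f = [l]() mutable -> int { return n(l); };
    return f();
}
```

The lambda has to be mutable because k::operator int() is not const in the question's code.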

YSC answered Sep 20 '22 01:09