Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does the Ruby debugger return different values than the code at run time?

See this simple Ruby class:

require 'byebug'

class Foo
  def run
    byebug

    puts defined?(bar)
    puts bar.inspect

    bar = 'local string'

    puts defined?(bar)
    puts bar.inspect
  end

  def bar
    'string from method'
  end
end

Foo.new.run

When running this class the following behavior can be observed in the debugger's console:

    $ ruby byebug.rb

    [2, 11] in /../test.rb
        2:
        3: class Foo
        4:   def run
        5:     byebug
        6:
    =>  7:     puts defined?(bar)
        8:     puts bar.inspect
        9:
       10:     bar = 'local string'
       11:

At the breakpoint the debugger returns the following values:

    (byebug) defined?(bar)
    "local-variable"
    (byebug) bar.inspect
    "nil"

Note that - although the debugger's breakpoint is in line #5 - it already knows that there will be a local variable bar defined in line #10 that will shadow the method bar and the debugger is actually not able anymore to call the bar method. What is doesn't know at this point is that the string 'local string' will be assign to bar. The debugger returns nil for bar.

Let's continue with the original code in the Ruby file and look at its output:

    (byebug) continue
    method
    "string from method"
    local-variable
    "local string"

At run time in line #7 Ruby still knowns that bar is indeed a method and it is still able to call it in line #8. Then l ine #10 actually defines the local variable that shadows the method with the same name and tTherefore Ruby returns like expected in line #12 and #13.

Questions: Why does the debugger return different values than the original code? It seems like it is able to look into the future. Is this considered a feature or a bug? Is this behavior documented?

like image 705
spickermann Avatar asked Mar 09 '17 13:03

spickermann


People also ask

How do I run a Ruby script in debug mode?

In order to start the Ruby debugger, load the debug library using the command-line option -r debug. The debugger stops before the first line of executable code and asks for the input of user commands.


1 Answers

Whenever you drop into a debugging session, you're effectively executing an eval against the binding at that spot in the code. Here's a simpler bit of code that recreates the behavior that's driving you nuts:

def make_head_explode
  puts "== Proof bar isn't defined"
  puts defined?(bar)   # => nil

  puts "== But WTF?! It shows up in eval"
  eval(<<~RUBY)
    puts defined?(bar) # => 'local-variable'
    puts bar.inspect   # => nil
  RUBY

  bar = 1
  puts "\n== Proof bar is now defined"
  puts defined?(bar)   # => 'local-variable'
  puts bar.inspect     # => 1
end

When the method make_head_explode is fed to the interpreter, it's compiled to YARV instructions, a local table, which stores information about the method's arguments and all local variables in the method, and a catch table that includes information about rescues within the method if present.

The root cause of this issue is that since you're compiling code dynamically at runtime with eval, Ruby passes the local table, which includes an unset variable enry, to eval as well.

To start, let's use a use a very simple method that demonstrates the behavior we'd expect.

def foo_boom
  foo         # => NameError
  foo = 1     # => 1
  foo         # => 1
end

We can inspect this by extracting the YARV byte code for the existing method with RubyVM::InstructionSequence.disasm(method). Note I'm going to ignore trace calls to keep the instructions tidy.

Output for RubyVM::InstructionSequence.disasm(method(:foo_boom)) less trace:

== disasm: #<ISeq:foo_boom@(irb)>=======================================
local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] foo
0004 putself
0005 opt_send_without_block <callinfo!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
0008 pop
0011 putobject_OP_INT2FIX_O_1_C_
0012 setlocal_OP__WC__0 2
0016 getlocal_OP__WC__0 2
0020 leave                                                            ( 253)

Now let's walk through the trace.

local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] foo

We can see here that YARV has identified we have the local variable foo, and stored it in our local table at index [2]. If we had other local variables and arguments, they'd also appear in this table.

Next we have the instructions generated when we try to call foo before its assigned:

  0004 putself
  0005 opt_send_without_block <callinfo!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>, <callcache>
  0008 pop

Let's dissect what happens here. Ruby compiles function calls for YARV according to the following pattern:

  • Push receiver: putself, referring to top-level scope of function
  • Push arguments: none here
  • Call the method/function: function call (FCALL) to foo

Next we have the instructions for setting at getting foo once it becomes a global variable:

0008 pop
0011 putobject_OP_INT2FIX_O_1_C_
0012 setlocal_OP__WC__0 2
0016 getlocal_OP__WC__0 2
0020 leave                                                            ( 253)

Key takeaway: when YARV has the entire source code at hand, it knows when locals are defined and treats premature calls to local variables as FCALLs just as you'd expect.

Now let's look at a "misbehaving" version that uses eval

def bar_boom
  eval 'bar'     # => nil, but we'd expect an errror
  bar = 1         # => 1
  bar
end

Output for RubyVM::InstructionSequence.disasm(method(:bar_boom)) less trace:

== disasm: #<ISeq:bar_boom@(irb)>=======================================
local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] bar
0004 putself
0005 putstring        "bar"
0007 opt_send_without_block <callinfo!mid:eval, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
0010 pop
0013 putobject_OP_INT2FIX_O_1_C_
0014 setlocal_OP__WC__0 2
0018 getlocal_OP__WC__0 2
0022 leave                                                            ( 264)

Again we see a local variable, bar, in the locals table at index 2. We also have the following instructions for eval:

0004 putself
0005 putstring        "bar"
0007 opt_send_without_block <callinfo!mid:eval, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
0010 pop

Let's dissect what happens here:

  • Push receiver: again putself, referring to top-level scope of function
  • Push arguments: "bar"
  • Call the method/function: function call (FCALL) to eval

Afterward, we have the standard assignment to bar that we'd expect.

0013 putobject_OP_INT2FIX_O_1_C_
0014 setlocal_OP__WC__0 2
0018 getlocal_OP__WC__0 2
0022 leave                                                            ( 264)

Had we not had eval here, Ruby would have known to treat the call to bar as a function call, which would have blown up as it did in our previous example. However, since eval is dynamically evaluated and the instructions for its code won't be generated until runtime, the evaluation occurs in the context of the already determined instructions and local table, which holds the phantom bar that you see. Unfortunately, at this stage, Ruby is unaware that bar was initialized "below" the eval statement.

For a deeper dive, I'd recommend reading Ruby Under a Microscope and the Ruby Hacking Guide's section on Evaluation.

like image 96
fny Avatar answered Nov 12 '22 13:11

fny