
What happens when you use string interpolation in ruby?

Tags: ruby

I thought that Ruby just calls the to_s method, but I can't explain how this works:

class Fake
  def to_s
    self
  end
end

"#{Fake.new}"

By that logic this should raise "stack level too deep" because of infinite recursion. But it works fine and seems to call #to_s from Object.

=> "#<Fake:0x137029f8>"

But why?

ADDED:

class Fake
  def to_s
    Fake2.new
  end
end

class Fake2
  def to_s
    "Fake2#to_s"
  end
end

This code works differently in two cases:

puts "#{Fake.new}" => "#<Fake:0x137d5ac4>"

But:

puts Fake.new.to_s => "Fake2#to_s"

I think this is abnormal. Can somebody explain where in the Ruby interpreter this happens internally?

abonec asked Aug 25 '14 15:08





1 Answer

Short version

Ruby does call to_s, but it checks that to_s returns a string. If it doesn't, Ruby uses the default implementation of to_s instead. Calling to_s recursively wouldn't be a good idea: there is no guarantee of termination, and Ruby code shouldn't be able to crash the whole VM.
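A minimal sketch of that fallback, reusing the Fake class from the question. The variable names are only for illustration, and the exact address in the printed representation will differ from run to run:

```ruby
# Class from the question: to_s breaks the contract by returning a non-String.
class Fake
  def to_s
    self
  end
end

fake = Fake.new

# Calling to_s directly just hands back whatever the method returns.
direct = fake.to_s
puts direct.equal?(fake)   # true: the very same object, not a String

# Interpolation sees the non-String result and falls back to the default
# representation, so there is no recursion.
interpolated = "#{fake}"
puts interpolated.class    # String
puts interpolated          # something like #<Fake:0x...>
```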

You get different output from Fake.new.to_s because that call returns a Fake2 instance, and to_s is then called a second time, on that instance, when the result is displayed: in your example puts converts its argument with to_s.
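The two cases from the question can be put side by side as a sketch. It writes to a StringIO instead of the terminal purely so the output of puts can be inspected as a value; that detail is not part of the original question:

```ruby
require "stringio"

class Fake2
  def to_s
    "Fake2#to_s"
  end
end

class Fake
  def to_s
    Fake2.new
  end
end

# An explicit to_s call returns a Fake2 instance, not a String.
result = Fake.new.to_s
puts result.class          # Fake2

# puts converts its argument with to_s, which here is Fake2#to_s.
out = StringIO.new
out.puts(result)
printed = out.string
puts printed               # Fake2#to_s

# Interpolating the Fake object itself takes the fallback path instead.
puts "#{Fake.new}"         # something like #<Fake:0x...>
```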

Long version

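To answer "what happens when ruby does x", a good place to start is to look at what instructions get generated for the VM (this is all MRI-specific). For your example (with Foo standing in for your Fake class; the code is only compiled, never run, so the class doesn't need to exist):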

puts RubyVM::InstructionSequence.compile('"#{Foo.new}"').disasm

outputs

0000 trace            1                                               (   1)
0002 getinlinecache   9, <is:0>
0005 getconstant      :Foo
0007 setinlinecache   <is:0>
0009 opt_send_simple  <callinfo!mid:new, argc:0, ARGS_SKIP>
0011 tostring         
0012 concatstrings    1
0014 leave      

There's some messing around with the inline cache, and you'll always get trace and leave, but in a nutshell this says:

  1. get the constant Foo
  2. call its new method
  3. execute the tostring instruction
  4. execute the concatstrings instruction with the result of the tostring instruction (the last value on the stack). If you do this with multiple #{} sequences, you can see it building up all the individual strings and then calling concatstrings once, consuming all of those strings.
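To see the multiple-interpolation case for yourself, a sketch along these lines should work on MRI. The names a and b are placeholders that never need to be defined, since the code is only compiled. Instruction names vary between Ruby versions (newer releases split tostring into other instructions), so treat the exact dump as version-dependent:

```ruby
# Compile (but do not run) a string literal with two interpolations.
disasm = RubyVM::InstructionSequence.compile('"#{a}-#{b}"').disasm
puts disasm

# However many pieces the literal has, they are joined by a single
# concatstrings instruction at the end.
concat_lines = disasm.lines.grep(/concatstrings/)
puts concat_lines.size
```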

The instructions in this dump are defined in insns.def, which maps each instruction to its implementation. You can see that tostring just calls rb_obj_as_string.

If you search for rb_obj_as_string through the Ruby codebase (I find http://rxr.whitequark.org useful for this) you can see it's defined as

VALUE
rb_obj_as_string(VALUE obj)
{
    VALUE str;

    if (RB_TYPE_P(obj, T_STRING)) {
        return obj;
    }
    str = rb_funcall(obj, id_to_s, 0);
    if (!RB_TYPE_P(str, T_STRING))
        return rb_any_to_s(obj);
    if (OBJ_TAINTED(obj)) OBJ_TAINT(str);
    return str;
}

In brief, if we already have a string then return that. If not, call the object's to_s method. Then (and this is what is crucial for your question) it checks the type of the result. If it's not a string, it returns the result of rb_any_to_s(obj) instead, which is the function that implements the default to_s.
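The same logic can be approximated in plain Ruby. This is only a rough sketch of the C function above, not how MRI implements it: obj_as_string is a hypothetical helper name, the taint propagation line is left out, and the default to_s is reached by binding Kernel's to_s to the object:

```ruby
# Hypothetical Ruby-level approximation of rb_obj_as_string.
def obj_as_string(obj)
  # Already a String: return it unchanged.
  return obj if obj.is_a?(String)

  # Otherwise ask the object for its own to_s.
  str = obj.to_s

  # If to_s did not return a String, use the default to_s instead
  # (the Ruby-level counterpart of rb_any_to_s).
  return Kernel.instance_method(:to_s).bind(obj).call unless str.is_a?(String)

  str
end

# Class from the question, whose to_s returns a non-String.
class Fake
  def to_s
    self
  end
end

fake = Fake.new
puts obj_as_string(fake) == "#{fake}"   # true: same result as interpolation
puts obj_as_string(42)                  # 42
```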

Frederick Cheung answered Nov 15 '22 20:11
