
Ruby Method Lookup (comparison with JavaScript)

I want to better understand how objects in Ruby have access to methods defined in classes and modules. Specifically, I want to compare and contrast it with JavaScript (which I'm more familiar with).

In JavaScript, a method is looked up on the object itself first; if it isn't found there, the lookup moves to the object's prototype. This process continues up the prototype chain until it reaches Object.prototype.

// JavaScript Example
var parent = {
  someMethod: function () {
    console.log( 'Inside Parent' );
  }
};

var child = Object.create( parent );
child.someMethod = function () {
  console.log( 'Inside Child' );
};

var obj1 = Object.create( child );
var obj2 = Object.create( child );

obj1.someMethod(); // 'Inside Child'
obj2.someMethod(); // 'Inside Child'

In the JavaScript example, neither obj1 nor obj2 has the someMethod function on the object itself. The key points to note are:

  1. There's one copy of the someMethod function in the child object and both obj1 and obj2 delegate to the child object.
  2. What this means is that neither obj1 nor obj2 has a copy of the someMethod function on the object itself.
  3. If the child object didn't have the someMethod function defined, then delegation would continue to the parent object.

Now I want to contrast this with a similar example in Ruby:

# Ruby Example
class Parent
  def some_method
    puts 'Inside Parent'
  end
end

class Child < Parent
  def some_method
    puts 'Inside Child'
  end
end

obj1 = Child.new
obj2 = Child.new

obj1.some_method  # 'Inside Child'
obj2.some_method  # 'Inside Child'

Here are my questions:

  1. Do obj1 and obj2 in the Ruby code each own a copy of the some_method method? Or is it similar to JavaScript where both objects have access to some_method via another object (in this case, via the Child class)?
  2. Similarly, when inheritance is taken into account in Ruby, does each Ruby object have a copy of all of the class and superclass methods of the same name?

My gut tells me that Ruby objects DO NOT have separate copies of the methods inherited from their class, mixed-in modules, and superclasses. Instead, my gut is that Ruby handles method lookup similarly to JavaScript, where objects check if the object itself has the method and if not, it looks up the method in the object's class, mixed-in modules, and superclasses until the lookup reaches BasicObject.

wmock asked Sep 29 '22 20:09


2 Answers

  1. Do obj1 and obj2 in the Ruby code each own a copy of the some_method method? Or is it similar to JavaScript where both objects have access to some_method via another object (in this case, via the Child class)?

You don't know. The Ruby Language Specification simply says "if you do this, that happens". It does not, however, prescribe a particular way of making that happen. Every Ruby implementation is free to implement it in the way it sees fit; as long as the results match those of the spec, the spec doesn't care how those results were obtained.

You can't tell. If the implementation maintains proper abstraction, it will be impossible for you to tell how they do it. That is just the nature of abstraction. (It is, in fact, pretty much the definition of abstraction.)

  2. Similarly, when inheritance is taken into account in Ruby, does each Ruby object have a copy of all of the class and superclass methods of the same name?

Same as above.

There are a lot of Ruby implementations currently, and there have been even more in the past, in various stages of (in)completeness. Some of those implement(ed) their own object models (e.g. MRI, YARV, Rubinius, MRuby, Topaz, tinyrb, RubyGoLightly), some sit on top of an existing object model into which they are trying to fit (e.g. XRuby and JRuby on Java, Ruby.NET and IronRuby on the CLI, SmallRuby, smalltalk.rb, Alumina, and MagLev on Smalltalk, MacRuby and RubyMotion on Objective-C/Cocoa, Cardinal on Parrot, Red Sun on ActionScript/Flash, BlueRuby on SAP/ABAP, HotRuby and Opal.rb on ECMAScript).

Who is to say that all of those implementations work exactly the same?

My gut tells me that Ruby objects DO NOT have separate copies of the methods inherited from their class, mixed-in modules, and superclasses. Instead, my gut is that Ruby handles method lookup similarly to JavaScript, where objects check if the object itself has the method and if not, it looks up the method in the object's class, mixed-in modules, and superclasses until the lookup reaches BasicObject.

Despite what I wrote above, that is a reasonable assumption, and is, in fact, how the implementations that I know about (MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, Topaz) work.

Just think about what it would mean if it weren't so. Every instance of the String class would need to have its own copy of all of String's 116 methods. Think about how many Strings there are in a typical Ruby program!

ruby -e 'p ObjectSpace.each_object(String).count'
# => 10013

Even in this most trivial of programs, which doesn't require any libraries, and only creates one single string itself (for printing the number to the screen), there are already more than 10000 strings. Every single one of those would have its own copies of the over 100 String methods. That would be a huge waste of memory.

It would also be a synchronization nightmare! Ruby allows you to monkeypatch methods at any time. What if I redefine a method in the String class? Ruby would now have to update every single copy of that method, even across different threads.
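A minimal sketch of the monkeypatching point. The class name Greeter and its hello method are hypothetical, chosen only for illustration. Reopening the class after the instances exist changes what both of them do, which is the behaviour you would expect if objects find the method through their class rather than holding private copies. It shows observable behaviour only and says nothing about how a given implementation stores the method.

```ruby
# Greeter is a hypothetical class used only for this illustration.
class Greeter
  def hello
    'hello'
  end
end

a = Greeter.new
b = Greeter.new
before = [a.hello, b.hello]

# Reopen the class after both instances already exist and redefine the method.
class Greeter
  def hello
    'hi'
  end
end

after = [a.hello, b.hello]

p before # => ["hello", "hello"]
p after  # => ["hi", "hi"]
```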

And I actually only counted public methods defined directly in String. Taking into account private methods, the number of methods is even bigger. And of course, there is inheritance: strings wouldn't just need a copy of every method in String, but also a copy of every method in Comparable, Object, Kernel, and BasicObject. Can you imagine every object in the system having a copy of require?
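You can ask Ruby itself where a given method comes from. The sketch below uses only standard reflection methods (Module#ancestors, Method#owner, Object#singleton_methods). The exact counts and the full ancestor list vary between Ruby versions and loaded libraries, so no numbers are shown.

```ruby
# How many methods does String define itself (not counting inherited ones)?
public_count  = String.public_instance_methods(false).size
private_count = String.private_instance_methods(false).size
puts "String itself defines #{public_count} public and #{private_count} private methods"

# Everything else comes from further up the chain.
p String.ancestors

s = 'hello'
p s.method(:upcase).owner    # => String
p s.method(:between?).owner  # => Comparable
p s.method(:require).owner   # => Kernel (private, so it is fetched via #method)
p s.singleton_methods        # => [] (nothing is defined on this string itself)
```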

No, the way it works in most Ruby implementations is like this. An object has an identity, instance variables, and a class (in statically typed pseudo-Ruby):

struct Object
  object_id: Id
  ivars: Dictionary<Symbol, *Object>
  class: *Class
end

A module has a method dictionary, a constant dictionary, and a class variable dictionary:

struct Module
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
end

A class is like a module, but it also has a superclass:

struct Class
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
  superclass: *Class
end

When you call a method on an object, Ruby will look up the object's class pointer and try to find the method there. If it doesn't find it, it will look at the class's superclass pointer and so on, until it reaches a class which has no superclass. At that point it will not actually give up, but try to call the method_missing method on the original object, passing the name of the method you tried to call as an argument. That's just a normal method call, too, so it follows all the same rules (except that if a call to method_missing reaches the top of the hierarchy, it will not try to call it again, since that would result in an infinite loop).
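The method_missing fallback can be observed from plain Ruby. In this sketch the Ghost class and the say_ prefix are made up for illustration. Calls that no class in the chain defines end up in method_missing, and calling super from there hands the call on to the default implementation, which raises NoMethodError.

```ruby
# Ghost is a hypothetical class; the "say_" naming scheme is an assumption
# made only for this example.
class Ghost
  def method_missing(name, *args)
    if name.to_s.start_with?('say_')
      "#{name.to_s.sub('say_', '')}!"
    else
      super # continue up the chain; the default raises NoMethodError
    end
  end

  # Keep respond_to? consistent with what method_missing handles.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('say_') || super
  end
end

g = Ghost.new
p g.say_boo # => "boo!"

begin
  g.other
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```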

Oh, but we ignored one thing: singleton methods! Every object needs to have its own method dictionary as well. Actually, rather, every object has its own private singleton class in addition to its class:

struct Object
  object_id: Id
  ivars: Dictionary<Symbol, *Object>
  class: *Class
  singleton_class: Class
end

So, method lookup starts first in the singleton class, and only then goes to the class.
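A short sketch of the singleton class taking priority, reusing the Child class from the question (redefined here so the snippet stands alone, and returning strings instead of printing them). Only obj1 gets a singleton method; obj2 still finds the method in Child.

```ruby
class Child
  def some_method
    'Inside Child'
  end
end

obj1 = Child.new
obj2 = Child.new

# Define a method on obj1 only; it lives in obj1's singleton class.
def obj1.some_method
  'Inside singleton'
end

p obj1.some_method                # => "Inside singleton"
p obj2.some_method                # => "Inside Child"
p obj1.singleton_methods          # => [:some_method]
p obj2.singleton_methods          # => []
p obj1.method(:some_method).owner == obj1.singleton_class # => true
p obj2.method(:some_method).owner # => Child
```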

And what about mixins? Oh, right, every module and class also needs a list of its included mixins:

struct Module
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
  mixins: List<*Module>
end

struct Class
  methods: Dictionary<Symbol, *Method>
  constants: Dictionary<Symbol, *Object>
  cvars: Dictionary<Symbol, *Object>
  superclass: *Class
  mixins: List<*Module>
end

Now, the algorithm goes: look first in the singleton class, then the class, and then the superclass(es), where "look" also means "after you look at the method dictionary, also look at all the method dictionaries of the included mixins (and the included mixins of the included mixins, and so forth, recursively) before going up to the superclass".
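The resulting order is visible through Module#ancestors and through super. In this sketch all module and class names are hypothetical. Each who method prepends its own name and then calls super, so the returned string traces the path the lookup takes.

```ruby
# All names here are made up for illustration.
module Inner
  def who
    "Inner > #{super}"
  end
end

module Outer
  include Inner # a mixin that itself includes a mixin

  def who
    "Outer > #{super}"
  end
end

class Base
  def who
    'Base'
  end
end

class Derived < Base
  include Outer

  def who
    "Derived > #{super}"
  end
end

p Derived.ancestors.take(4) # => [Derived, Outer, Inner, Base]
p Derived.new.who           # => "Derived > Outer > Inner > Base"
```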

Does that sound complicated? It is! And that's not good. Method lookup is the single most often executed algorithm in an object-oriented system, so it needs to be simple and lightning fast. So, what some Ruby implementations (e.g. MRI, YARV) do is decouple the interpreter's internal notion of what "class" and "superclass" mean from the programmer's view of those same concepts.

An object no longer has both a singleton class and a class, it just has a class:

struct Object
  object_id: Id
  ivars: Dictionary<Symbol, *Object>
  class: *Class
end

A class no longer has a list of included mixins, just a superclass. It may, however, be hidden. Note also that the Dictionaries become pointers; you'll see why in a moment:

struct Class
  methods: *Dictionary<Symbol, *Method>
  constants: *Dictionary<Symbol, *Object>
  cvars: *Dictionary<Symbol, *Object>
  superclass: *Class
  visible?: Bool
end

Now, the object's class pointer will always point to the singleton class, and the singleton class's superclass pointer will always point to the object's actual class. If you include a mixin M into a class C, Ruby will create a new invisible class M′ which shares its method, constant and cvar dictionaries with the mixin. This mixin class will become the superclass of C, and the old superclass of C will become the superclass of the mixin class:

M′ = Class.new(
  methods = M->methods
  constants = M->constants
  cvars = M->cvars
  superclass = C->superclass
  visible? = false
)

C->superclass = *M′

Actually, it's a little bit more involved, since it also has to create classes for the mixins that are included in M (and recursively), but in the end, what we end up with is a nice linear method lookup path with no side-stepping into singleton classes and included mixins.

Now, the method lookup algorithm is just this:

def lookup(meth, obj)
  c = obj->class

  until res = c->methods[meth]
    c = c->superclass
    raise MethodNotFound, meth if c.nil?
  end

  res
end

Nice and clean and lean and fast.
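The same linear walk can be approximated in real Ruby using reflection. This is only a sketch: find_owner is a hypothetical helper, not part of Ruby, and it ignores method_missing, undefined methods, and refinements. It walks the flattened ancestor list of the object's singleton class and returns the first module or class that defines the method itself, which should agree with what Method#owner reports.

```ruby
# find_owner is a hypothetical helper, not part of Ruby. It returns the first
# entry in the linear ancestor chain that defines `name` itself, or nil.
# Note: it will not work for objects without a singleton class (e.g. Integers).
def find_owner(obj, name)
  obj.singleton_class.ancestors.find do |mod|
    mod.instance_methods(false).include?(name) ||
      mod.private_instance_methods(false).include?(name)
  end
end

class Parent
  def some_method
    'Inside Parent'
  end
end

class Child < Parent
  def some_method
    'Inside Child'
  end
end

obj = Child.new
p find_owner(obj, :some_method) # => Child
p find_owner(obj, :puts)        # => Kernel
p find_owner(obj, :nope)        # => nil
```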

As a trade-off, finding out the class of an object or the superclass of a class is slightly more difficult, because you can't simply return the class or superclass pointer, you have to walk the chain until you find a class that is not hidden. But how often do you call Object#class or Class#superclass? Do you even call it at all, outside of debugging?
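The programmer-facing side of this is easy to check. In the sketch below (all names hypothetical), Class#superclass reports the real superclass even though a mixin sits between the two in the lookup path, and Object#class reports the real class even though the object has a singleton class in front of it. This shows only the visible behaviour; it does not prove how any particular implementation arranges things internally.

```ruby
# Walkable, Animal and Dog are made-up names for illustration.
module Walkable; end

class Animal; end

class Dog < Animal
  include Walkable
end

rex = Dog.new

# Give rex a singleton method so its singleton class is definitely in play.
def rex.name
  'Rex'
end

p Dog.ancestors.first(3)         # => [Dog, Walkable, Animal]
p Dog.superclass                 # => Animal (the mixin is skipped)
p rex.class                      # => Dog (the singleton class is skipped)
p rex.singleton_class.superclass # => Dog
```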

Unfortunately, Module#prepend doesn't fit cleanly into this picture. And Refinements really mess things up, which is why many Ruby implementations don't even implement them.

Jörg W Mittag answered Oct 03 '22 01:10


Let's continue working with your example in an IRB session and see what we might learn:

> obj1.method(:some_method)
=> #<Method: Child#some_method>
> obj1.method(:some_method).source_location
=> ["(irb)", 8]
> obj2.method(:some_method)
=> #<Method: Child#some_method>
> obj2.method(:some_method).source_location
=> ["(irb)", 8]

Ah ok, so two objects of the same class have the same Method. I wonder if that's always true...

> obj1.instance_eval do
>   def some_method
>     puts 'what is going on here?'
>   end
> end
=> nil
> obj1.some_method
what is going on here?
=> nil
> obj2.some_method
Inside Child
=> nil
> obj1.method(:some_method)
=> #<Method: #<Child:0x2b9c128>.some_method>
> obj1.method(:some_method).source_location
=> ["(irb)", 19]

Well that's interesting.

James Coglan has a nice blog post which offers a better explanation of much of this than I will at https://blog.jcoglan.com/2013/05/08/how-ruby-method-dispatch-works/

It might also be interesting to consider when any of this is important. Think about how much of this system is an implementation detail of the interpreter and could be handled differently in MRI, JRuby, and Rubinius and what actually needs to be consistent for a Ruby program to execute consistently in all of them.

Jonah answered Oct 03 '22 00:10