 

How does Ruby's Enumerator object iterate externally over an internal iterator?

According to Ruby's documentation, an Enumerator uses the each method to enumerate if no target method is passed to to_enum or enum_for. Take the following object with a singleton each method, and its enumerator, as an example:

o = Object.new
def o.each
  yield 1
  yield 2
  yield 3
end
e = o.to_enum

loop do
  puts e.next
end

Given that the Enumerator object uses the each method to answer when next is called, what do the calls to each look like every time next is called? Does the Enumerator class pre-load all the contents of o.each and create a local copy for enumeration? Or is there some sort of Ruby magic that suspends execution at each yield statement until next is called on the enumerator?

If an internal copy is made, is it a deep copy? What about I/O objects that could be used for external enumeration?

I'm using Ruby 1.9.2.

Salman Paracha asked Jun 15 '12 19:06



1 Answer

It's not exactly magic, but it is beautiful nonetheless. Instead of making a copy of some sort, the Enumerator runs each on the target enumerable object inside a Fiber. Whenever each hands over its next object, the Fiber yields that object and thereby returns control to the point where the Fiber was resumed.
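To make that hand-off concrete, here is a minimal sketch of a bare Fiber, independent of Enumerator. The log array and the values are made up for illustration; the only point is that Fiber.yield suspends the block and resume continues it from the same spot.

```ruby
log = []

fiber = Fiber.new do
  log << "started"
  Fiber.yield 1        # suspend here; 1 becomes the return value of resume
  log << "continued"
  Fiber.yield 2        # suspend again
  log << "finishing"
  :done                # the block's value is returned by the final resume
end

first = fiber.resume   # runs the block up to the first Fiber.yield
log_after_first = log.dup
second = fiber.resume  # continues right after the first Fiber.yield
third = fiber.resume   # runs the rest of the block to its end
```

After the first resume only "started" has been logged, so the code after the first Fiber.yield really has not run yet.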

It's beautiful because this approach doesn't require a copy or any other form of "backup" of the enumerable object, such as the array you would get by calling #to_a on it. The cooperative scheduling of fibers allows switching contexts exactly when needed, without keeping any form of lookahead.
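One way to observe the absence of a copy or lookahead is to give each some visible side effects and check how far it has run after every call to next. This is only a sketch: LoggingSource and its log are hypothetical names introduced for the demonstration, and it uses the built-in Enumerator obtained from to_enum.

```ruby
class LoggingSource
  attr_reader :log

  def initialize
    @log = []
  end

  def each
    @log << "before 1"
    yield 1
    @log << "before 2"
    yield 2
    @log << "after 2"
  end
end

source = LoggingSource.new
e = source.to_enum

first = e.next
log_after_first = source.log.dup   # only the code before `yield 1` has run

second = e.next
log_after_second = source.log.dup  # now the code up to `yield 2` has run

finished = false
begin
  e.next                           # each runs to its end, nothing left to yield
rescue StopIteration
  finished = true
end
```

If the elements had been collected up front, all three log entries would already be present after the first call to next.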

It all happens in the C code for Enumerator. A pure Ruby version that would show roughly the same behavior could look like this:

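class MyEnumerator
  def initialize(enumerable)
    @fiber = Fiber.new do
      enumerable.each { |item| Fiber.yield item }
      # each has returned: mark the end explicitly, so that nil or false
      # items and each's own return value are never mistaken for it
      @done = true
    end
  end

  def next
    raise StopIteration, "iteration reached an end" if @done
    item = @fiber.resume
    raise StopIteration, "iteration reached an end" if @done
    item
  end
end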

class MyEnumerable
  def each
    yield 1
    yield 2
    yield 3
  end
end

e = MyEnumerator.new(MyEnumerable.new)
puts e.next # => 1
puts e.next # => 2
puts e.next # => 3
puts e.next # => StopIteration is raised
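The question also asks about I/O objects. As a hedged sketch, StringIO stands in for a real stream here, and the sample text is made up. Since the enumerator drives the very same object instead of a copy, reads made directly on the stream in between are visible to the enumerator:

```ruby
require "stringio"

io = StringIO.new("first\nsecond\nthird\n")
lines = io.each_line       # without a block this returns an Enumerator

first = lines.next         # reads exactly one line from io
pos_after_first = io.pos   # the stream has advanced only past "first\n"

skipped = io.gets          # read directly from io, bypassing the enumerator
third = lines.next         # the enumerator continues from io's current position
```

The enumerator never sees "second\n" because it was consumed on the shared stream. A real File or socket may buffer internally, so this sketch makes no claim about how many bytes are fetched from the operating system per call.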
emboss answered Sep 23 '22 13:09
