
Help understanding yield and enumerators in Ruby

I would appreciate it if someone could help me understand the difference between using a Yielder in an Enumerator vs. just invoking yield in an Enumerator.

The Well-Grounded Rubyist suggests that one doesn't "yield from the block" but doesn't explain precisely what's going on.

Thanks

David asked Jun 14 '09 15:06


3 Answers

It might help if you first understand how yield works. Here is an example:

def do_stuff
  if block_given?
    yield 5
  else
    5
  end
end

result = do_stuff {|x| x * 3 }
puts result

--output:--
15

In the do_stuff method call:

do_stuff {|x| x * 3 }

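...the block is like a function that is passed to the method do_stuff. Inside do_stuff, yield calls that function with the specified arguments, in this case 5. The block returns 15, which becomes the value of the yield expression and therefore the return value of do_stuff.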

Some important things to note:

  1. yield is called inside a method

  2. When you call a method, you can pass a block to the method

  3. yield is used to call the block.
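To make the three points above concrete, here is a small sketch that writes the same method twice: once with yield and once with an explicit block parameter. The names with_yield and with_call are made up for this illustration; the only point is that yield behaves like calling the block that was passed in.

```ruby
# with_yield and with_call are hypothetical names used only for illustration.

# Implicit block: yield calls whatever block was passed to the method.
def with_yield
  yield 5
end

# Explicit block: &block captures the same block as a Proc object.
def with_call(&block)
  block.call(5)
end

a = with_yield { |x| x * 3 }
b = with_call  { |x| x * 3 }

puts a   # 15
puts b   # 15
```

In both versions the block is tied to the method call, which is why yield only makes sense inside a method.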

Okay, now let's look at your comment question:

Is it true that

e = Enumerator.new do |y| 
  y << 1 
  y << 2 
  y << 3 
end 

is exactly the same as

e = Enumerator.new do   #I think you forgot to write .new here
    yield 1 
    yield 2 
    yield 3 
end

In the second example, there is no method definition anywhere--so you can't call yield. Error! Therefore, the two examples are not the same.

However, you could do this:

def do_stuff
  e = Enumerator.new do 
      yield 1 
      yield 2 
      yield 3 
  end 
end

my_enum = do_stuff {|x| puts x*3}
my_enum.next

--output:--
3
6
9
1.rb:12:in `next': iteration reached an end (StopIteration)
    from 1.rb:12:in `<main>'

But that is a funny enumerator because it doesn't produce any values: it just executes some code (which happens to print some output), then ends. That enumerator is almost equivalent to:

def do_stuff
  e = Enumerator.new do 
  end 
end

my_enum = do_stuff
my_enum.next

--output:--
1.rb:7:in `next': iteration reached an end (StopIteration)
    from 1.rb:7:in `<main>'

When an enumerator cannot produce a value, it raises a StopIteration exception. So in both cases, the enumerator couldn't produce a value.
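A short sketch may make it clearer where the values go in that case. The method name funny_enum and the received array are assumptions made up for this example. The values handed to yield end up in the block passed to the method, while the enumerator itself stays empty.

```ruby
# funny_enum is a hypothetical name used only for illustration.
def funny_enum
  Enumerator.new do
    yield 1   # goes to the block passed to funny_enum, not to the enumerator
    yield 2
  end
end

received = []
e = funny_enum { |x| received << x }

values = e.to_a   # runs the enumerator's block from start to finish

p received   # [1, 2]
p values     # []
```

Nothing was given to a yielder, so the enumerator has no elements to hand out.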

You also asked: "But it's still not clear to me what the 'yielder' is doing. It looks like it is collecting all the calculated values so that it can regurgitate them later when you use the enumerator. If that's the case, then it seems like it would only be practical for 'small' sequences... you wouldn't want to make an enumerator that stored 50 million items away."

No. In fact, you can create an enumerator that produces an infinite number of values. Here is an example:

e = Enumerator.new do |y|
  val = 1

  while true
    y << val
    val += 1
  end

end

puts e.next
puts e.next
puts e.next

--output:--
1
2
3
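Because values are produced on demand, an infinite enumerator like this can also be consumed with ordinary Enumerable methods, as long as the consumer stops at some point. The sketch below is one way to do that; the variable names are invented for the example.

```ruby
# An endless sequence 1, 2, 3, ... built the same way as above.
naturals = Enumerator.new do |y|
  val = 1
  loop do
    y << val
    val += 1
  end
end

# first(n) stops the iteration after n values have been produced.
first_five = naturals.first(5)

# lazy defers select until values are actually requested.
first_evens = naturals.lazy.select(&:even?).first(3)

p first_five    # [1, 2, 3, 4, 5]
p first_evens   # [2, 4, 6]
```

Without lazy, select would try to walk the whole sequence and never return.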

Adding some debugging messages should prove insightful:

e = Enumerator.new do |y|
  val = 1

  while true
    puts "in while loop"
    y << val
    val += 1
  end

end

puts e.next

--output:--
in while loop
1

Note that the message only printed once. So something is going on that is not obvious:

e = Enumerator.new do |y|
  val = 1

  while true
    puts "in while loop"
    y << val
    puts "just executed y << val"
    val += 1
  end

end

puts e.next

--output:--
in while loop
1

Because the message "just executed y << val" does not show up in the output, that means execution must have halted on the line y << val. Therefore, the enumerator did not continuously spin the while loop and insert all the values into y--even though the syntax is exactly the same as pushing values into an array: arr << val.

What y << val really means is: hand this value to the caller of e.next and pause here; the following call to e.next resumes execution on the next line. If you add another puts e.next to the previous example, you will see this additional output:

just executed y << val
in while loop
2

What's happening is that each call to e.next runs the block until it reaches y << val. The value on the right side becomes the return value of e.next, and the block stays paused at that line until e.next is called again, at which point execution continues on the next line.
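The same pausing can be observed without reading interleaved output by recording each step in an array. This is only a sketch; the log array and the message strings are made up for the example.

```ruby
log = []

e = Enumerator.new do |y|
  log << "before 1"
  y << 1
  log << "after 1"
  y << 2
  log << "after 2"
end

first = e.next
log_after_first = log.dup    # the block is paused at y << 1

second = e.next
log_after_second = log.dup   # the block is paused at y << 2

p first              # 1
p log_after_first    # ["before 1"]
p second             # 2
p log_after_second   # ["before 1", "after 1"]
```

"after 2" is only recorded if next is called a third time, and that call then raises StopIteration because no further value is produced.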

It would probably have made more sense if ruby had made the syntax for the yielder statement like this:

y >> val

And we could interpret that as meaning: halt execution here, then when e.next is called produce val.
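This pause-and-resume behavior closely resembles what Ruby's Fiber class provides, and as far as I know CRuby runs the enumerator's block in a fiber when you call next. Treat the following as a rough analogy under that assumption, not as the actual implementation of Enumerator: Fiber.yield plays the role of y << val, and resume plays the role of e.next.

```ruby
# Analogy only: a fiber that pauses each time it has a value ready.
counter = Fiber.new do
  val = 1
  loop do
    Fiber.yield val   # pause here and hand val to whoever called resume
    val += 1
  end
end

a = counter.resume
b = counter.resume
c = counter.resume

p [a, b, c]   # [1, 2, 3]
```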

David Black recommends not using the y.yield val syntax, which is equivalent to y << val, lest readers think it works like the yield statement. y.yield val should be interpreted as: "stop execution here, and when next is called produce val, then continue execution on the next line." Personally, I think that the syntax y << val stands out more than y.yield val, so it is easier to spot in the code and readily identify where execution halts.

7stud answered Oct 06 '22 00:10


Well, unless I'm missing something, the method with yield simply doesn't work. Try it:

e = Enumerator.new do |y|
  y << 1
  y << 2
  y << 3
end

f = Enumerator.new do
  yield 1
  yield 2
  yield 3
end

e.each { |x| puts x }
f.each { |x| puts x }

Which produces this:

telemachus ~ $ ruby yield.rb 
1
2
3
yield.rb:13:in `block in <main>': no block given (yield) (LocalJumpError)
        from yield.rb:19:in `each'
        from yield.rb:19:in `each'
        from yield.rb:19:in `<main>'

When David Black says (page 304) "you don't do this," he doesn't mean "it's not the best way to do it." He means, "that won't work."

Edit: You can, however, call yield explicitly this way:

e = Enumerator.new do |y|
  y.yield 1
  y.yield 2
  y.yield 3
end

If you find saying yield more explicit or clearer than <<, then do it that way.
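As a quick check, the sketch below builds one enumerator with << and one with y.yield and compares what they produce. The variable names are invented for the example.

```ruby
shovel_enum = Enumerator.new do |y|
  y << 1
  y << 2
  y << 3
end

yield_enum = Enumerator.new do |y|
  y.yield 1
  y.yield 2
  y.yield 3
end

p shovel_enum.to_a   # [1, 2, 3]
p yield_enum.to_a    # [1, 2, 3]
```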

Second edit: Looking at David's original post and Jorg's updated answer, I think that there was originally some confusion about the question. Jorg thought David was asking about the difference between Enumerator::Yielder#yield and Enumerator::Yielder#<<, but David wasn't sure what The Well-Grounded Rubyist means when it says "don't write yield 1 etc." My answer applies to the question about The Well-Grounded Rubyist. (When I looked this thread back over today, my answer looked odd in the light of other updates.)

Telemachus answered Oct 06 '22 00:10


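The Enumerator::Yielder#yield method and the Enumerator::Yielder#<< method do the same job: both hand a value to whoever is consuming the enumerator. The Ruby documentation describes << as an alias of yield. The only practical differences are that << returns the yielder itself, so calls can be chained, while yield accepts more than one argument.

So which of the two you use is largely personal preference, just like Enumerable#collect and Enumerable#map or Enumerable#inject and Enumerable#reduce.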
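One small difference is worth knowing when choosing between the two spellings: << returns the yielder, so several values can be chained on one line, whereas y.yield can pass more than one argument at a time. A minimal sketch, with made-up variable names:

```ruby
# << returns the yielder, so calls can be chained.
chained = Enumerator.new { |y| y << 1 << 2 << 3 }

# yield accepts several arguments, which arrive together as one element.
pairs = Enumerator.new { |y| y.yield 1, 2 }

p chained.to_a   # [1, 2, 3]
p pairs.to_a     # [[1, 2]]
```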

Jörg W Mittag answered Oct 06 '22 00:10