
Higher order iterators in Ruby?

Tags:

iterator

ruby

I was reading a Ruby question about the .each iterator, and someone stated that using .each can be a code smell if higher order iterators are better suited for the task. What are higher order iterators in Ruby?

Edit: Jörg W Mittag, the author of the StackOverflow answer that I was referring to, mentioned that he meant to write "higher-level iterators", and he also explains very well what they are below.

791

asked Aug 17 '10 15:08

jergason

2 Answers

Oops. I meant higher-level iterators, not higher-order. Every iterator is of course by definition higher-order.

Basically, iteration is a very low-level concept. The purpose of programming is to communicate intent to the other stakeholders on the team. "Initializing an empty array, then iterating over another array and adding the current element of this array to the first array if it is divisible by two without a remainder" is not communicating intent. "Selecting all even numbers" is.

In general, you almost never iterate over a collection just for iteration's sake. You either want to

transform each element in some way (that's usually called map, in Ruby and Smalltalk it's collect and in .NET and SQL it's Select),
reduce the whole collection down to some single value, e.g. computing the sum or the average or the standard deviation of a list of football scores (in category theory, that's called a catamorphism, in functional programming it is fold or reduce, in Smalltalk it's inject:into:, in Ruby it's inject and in .NET Aggregate),
keep only the elements that satisfy a certain condition (filter in most functional languages, select in Smalltalk and Ruby, also find_all in Ruby, Where in .NET and SQL),
keep only the elements that do not satisfy a condition (reject in Smalltalk and Ruby)
find the first element that satisfies a condition (find in Ruby)
count the elements that satisfy a condition (count in Ruby)
check if all elements (all?), at least one element (any?) or no elements (none?) satisfy a condition
group the elements into buckets based on some discriminator (group_by in Ruby, .NET and SQL)
partition the collection into two collections based on some predicate (partition)
sort the collection (sort, sort_by)
combine multiple collections into one (zip)
and so on and so forth …

Almost never is your goal to just iterate over a collection.
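As a rough illustration of the list above, here is a sketch that first does one task with each and then uses a few of the higher-level methods on the same data. The sample array and the variable names are made up for the example.

```ruby
numbers = [4, 8, 15, 16, 23, 42]

# Low-level: the reader has to reconstruct the intent from the mechanics.
evens = []
numbers.each { |n| evens << n if n.even? }

# Higher-level: each method name states the intent directly.
numbers.select(&:even?)           # => [4, 8, 16, 42]
numbers.reject(&:even?)           # => [15, 23]
numbers.map { |n| n * 2 }         # => [8, 16, 30, 32, 46, 84]
numbers.find { |n| n > 10 }       # => 15
numbers.count(&:odd?)             # => 2
numbers.all? { |n| n > 0 }        # => true
numbers.partition { |n| n < 16 }  # => [[4, 8, 15], [16, 23, 42]]
numbers.group_by { |n| n % 3 }    # => {1=>[4, 16], 2=>[8, 23], 0=>[15, 42]}
```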

In particular, reduce aka. inject aka. fold aka. inject:into: aka. Aggregate aka. catamorphism is your friend. There's a reason why it has such a fancy-sounding mathematical name: it is extremely powerful. In fact, most of what I mentioned above can be implemented in terms of reduce.

Basically, reduce "reduces" the entire collection down to a single value, using some function. You start with some sort of accumulator value, then you take the accumulator and the first element and feed them into the function. The result of that function becomes the new accumulator, which you pair up with the second element and feed to the function again, and so on.

The most obvious example of this is summing a list of numbers:

[4, 8, 15, 16, 23, 42].reduce(0) {|acc, elem|
  acc + elem
}

So, the accumulator starts out as 0, and we pass the first element 4 into the + function. The result is 4, which becomes the new accumulator. Now we pass the next element 8 in and the result is 12. And this continues till the last element and the result is that they were dead the whole time. No, wait, the result is 108.
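To make the walk-through above visible, here is a small sketch that records every intermediate accumulator instead of only the final sum. The helper name running_totals is hypothetical and not part of Ruby.

```ruby
# Hypothetical helper: returns the accumulator after every step of the sum,
# starting with the initial value 0.
def running_totals(numbers)
  numbers.reduce([0]) { |totals, elem| totals << totals.last + elem }
end

running_totals([4, 8, 15, 16, 23, 42])
# => [0, 4, 12, 27, 43, 66, 108]
```

Note that this is itself a reduce whose accumulator is an array rather than a number.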

Ruby actually allows us to take a couple of shortcuts: If the element type is the same as the accumulator type, you can leave out the accumulator and Ruby will simply pass the first element as the first value for the accumulator:

[4, 8, 15, 16, 23, 42].reduce {|acc, elem|
  acc + elem
}

Also, we can use Symbol#to_proc here:

[4, 8, 15, 16, 23, 42].reduce(&:+)

And actually, if you pass reduce a Symbol argument, it will treat it as the name of the method to use for the reduction operation:

[4, 8, 15, 16, 23, 42].reduce(:+)
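A quick side-by-side check of the spellings shown so far, plus one caveat of leaving out the initial value: with an empty collection there is no first element to start from, so reduce returns nil. The variable name is made up for the example.

```ruby
numbers = [4, 8, 15, 16, 23, 42]

numbers.reduce(0) { |acc, elem| acc + elem }  # => 108
numbers.reduce { |acc, elem| acc + elem }     # => 108
numbers.reduce(&:+)                           # => 108
numbers.reduce(:+)                            # => 108
numbers.reduce(0, :+)                         # => 108, initial value plus Symbol
numbers.sum                                   # => 108, built in since Ruby 2.4

# Without an initial value, an empty collection has nothing to start from.
[].reduce(:+)     # => nil
[].reduce(0, :+)  # => 0
```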

However, summing is not all that reduce can do. In fact, I find this example a little dangerous. Everybody I showed this to immediately understood, "Aah, so that's what a reduce is", but unfortunately some also thought that summing numbers is all reduce is, and that's definitely not the case. In fact, reduce is a general method of iteration, by which I mean that reduce can do anything that each can do. In particular, you can store arbitrary state in the accumulator.
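As one possible illustration of state in the accumulator, here is a sketch that counts occurrences using a Hash as the accumulator. The method name word_counts is hypothetical; in everyday code, each_with_object or tally (Ruby 2.7+) would usually be the more direct choice.

```ruby
# Hypothetical helper: counts how often each word occurs.
def word_counts(words)
  words.reduce(Hash.new(0)) do |counts, word|
    counts[word] += 1
    counts  # the block's return value becomes the next accumulator
  end
end

word_counts(%w[map reduce map select map])
# => {"map"=>3, "reduce"=>1, "select"=>1}
```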

For example, I wrote above that reduce reduces the collection down to a single value. But of course that "single value" can be arbitrarily complex. It could, for example, be itself a collection. Or a string:

class Array
  def mystery_method(foo)
    drop(1).reduce("#{first}") {|s, el| s << foo.to_str << el.to_s }
  end
end

This is an example of how far you can go with playing tricks with the accumulator. If you try it out, you'll of course recognize it as Array#join:

class Array
  def join(sep=$,)
    sep ||= ""  # $, is nil by default, and nil does not respond to to_str
    drop(1).reduce("#{first}") {|s, el| s << sep.to_str << el.to_s }
  end
end

Note that nowhere in this "loop" do I have to keep track of whether I'm at the last or second-to-last element. Nor is there any conditional in the code. There is no potential for fencepost errors here. If you think about how to implement this with each, you would have to somehow keep track of the index and check whether you are at the last element and then have an if in there, to prevent emitting the separator at the end.
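For comparison, this is roughly what the each-based version described above could look like. It is only a sketch; the name join_with_each is made up, and the index check is exactly the fencepost logic that the reduce version avoids.

```ruby
# Hypothetical helper, written with each_with_index for comparison only.
def join_with_each(items, sep = "")
  result = +""  # unary plus gives a mutable (unfrozen) empty string
  items.each_with_index do |item, index|
    result << item.to_s
    # The fencepost check: no separator after the last element.
    result << sep unless index == items.size - 1
  end
  result
end

join_with_each([1, 2, 3], ", ")  # => "1, 2, 3"
join_with_each([], ", ")         # => ""
```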

Since I wrote above that all iteration can be done with reduce, I might just as well prove it. Here are Ruby's Enumerable methods, implemented in terms of reduce instead of each, as they normally are. (Note that I only just started and have only gotten as far as g so far.)

module Enumerable
  def all?
    reduce(true) {|res, el| res && yield(el) }
  end

  def any?
    reduce(false) {|res, el| res || yield(el) }
  end

  alias_method :map, def collect
    reduce([]) {|res, el| res << yield(el) }
  end

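  def count
    # Hand back the unchanged accumulator when the block is false; a bare
    # `res + 1 if ...` would turn the accumulator into nil.
    reduce(0) {|res, el| yield(el) ? res + 1 : res }
  end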

  alias_method :find, def detect
    # Keep an element once it has been found; otherwise test the current one.
    reduce(nil) {|res, el| res || (el if yield(el)) }
  end

  def drop(n=1)
    # Parenthesize the decrement: `n -= 1 >= 0` parses as `n -= (1 >= 0)`.
    reduce([]) {|res, el| res.tap {|res| res << el if (n -= 1) < 0 }}
  end

  def drop_while
    # Drop only the leading run: once anything has been kept, keep the rest.
    reduce([]) {|res, el| res.tap {|res| res << el unless res.empty? && yield(el) }}
  end

  def each
    reduce(nil) {|_, el| yield el }
  end

  def each_with_index
    tap { reduce(-1) {|i, el| (i+1).tap {|i| yield el, i }}}
  end

  alias_method :select, def find_all
    reduce([]) {|res, el| res.tap {|res| res << el if yield el }}
  end

  def grep(pattern)
    reduce([]) {|res, el| res.tap {|res| res << yield(el) if pattern === el }}
  end

  def group_by
    reduce(Hash.new {|hsh, key| hsh[key] = [] }) {|res, el| res.tap {|res|
        res[yield(el)] << el  # append to the bucket instead of overwriting it
    }}
  end

  def include?(obj)
    # Stop at the first match; otherwise the accumulator stays false, not nil.
    reduce(false) {|_, el| break true if el == obj; false }
  end

  def reject
    reduce([]) {|res, el| res.tap {|res| res << el unless yield el }}
  end
end

[Note: I made some simplifications for the purpose of this post. For example, according to the standard Ruby Enumerable protocol, each is supposed to return self, so you'd have to slap an extra line in there; other methods behave slightly differently, depending on what kind and how many arguments you pass in and so on. I left those out because they distract from the point I am trying to make.]
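The same exercise can be continued past g. Below is a sketch of three more methods written in the same style. The module name ViaReduce and the _via_reduce suffixes are made up here so that nothing in the real Enumerable is overridden, and the same simplifications apply (a block is always expected, and optional arguments are ignored).

```ruby
# Hypothetical module: a few more Enumerable-style methods built on reduce.
module ViaReduce
  # Largest element according to <=>; nil for an empty collection.
  def max_via_reduce
    reduce {|best, el| (el <=> best) > 0 ? el : best }
  end

  # True if the block is false for every element.
  def none_via_reduce
    reduce(true) {|res, el| res && !yield(el) }
  end

  # Two arrays: elements for which the block is true, then all the others.
  def partition_via_reduce
    reduce([[], []]) {|parts, el| parts.tap { parts[yield(el) ? 0 : 1] << el }}
  end
end

# Extending a single object avoids patching Array or Enumerable globally.
numbers = [4, 8, 15, 16, 23, 42].extend(ViaReduce)
numbers.max_via_reduce                  # => 42
numbers.none_via_reduce {|n| n > 100 }  # => true
numbers.partition_via_reduce(&:even?)   # => [[4, 8, 16, 42], [15, 23]]
```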

196

answered Oct 02 '22 00:10

Jörg W Mittag

They're talking about more specialized methods such as map, filter or inject. For example, instead of this:

even_numbers = []
numbers.each {|num| even_numbers << num if num.even?}

You should do this:

even_numbers = numbers.select {|num| num.even?}

It says what you want to do but encapsulates all the irrelevant technical details in the select method. (And incidentally, in Ruby 1.8.7 or later, you can just write even_numbers = numbers.select(&:even?), so even more concise if slightly Perl-like.)

These aren't normally called "higher-order iterators," but whoever wrote that probably just had a minor mental mixup. It's a good principle whatever terminology you use.

answered Oct 02 '22 01:10

Chuck
