Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Higher order iterators in Ruby?

Tags:

iterator

ruby

I was reading a Ruby question about the .each iterator, and someone stated that using .each can be a code smell if higher order iterators are better suited for the task. What are higher order iterators in Ruby?

edit: Jörg W Mittag, the author of the StackOverflow answer that I was referring to mentioned that he meant to write higher level iterators, but he also explained what they are very well below.

like image 791
jergason Avatar asked Aug 17 '10 15:08

jergason


People also ask

What is the types of iterators in Ruby?

“Iterators” is the object-oriented concept in Ruby. In more simple words, iterators are the methods which are supported by collections(Arrays, Hashes etc.). Collections are the objects which store a group of data members. Ruby iterators return all the elements of a collection one after another.

Are iterators faster than for loops?

Iterators. To determine whether to use loops or iterators, you need to know which implementation is faster: the version of the search function with an explicit for loop or the version with iterators. The iterator version was slightly faster!

Is there a for loop in Ruby?

Ruby provides the different types of loop to handle the condition based situation in the program to make the programmers task simpler. The loops in Ruby are : while loop. for loop.

How do you break out of a loop in Ruby?

In Ruby, we use a break statement to break the execution of the loop in the program. It is mostly used in while loop, where value is printed till the condition, is true, then break statement terminates the loop. In examples, break statement used with if statement. By using break statement the execution will be stopped.


2 Answers

Oops. I meant higher-level iterators, not higher-order. Every iterator is of course by definition higher-order.

Basically, iteration is a very low-level concept. The purpose of programming is to communicate intent to the other stakeholders on the team. "Initializing an empty array, then iterating over another array and adding the current element of this array to the first array if it is divisible by two without a remainder" is not communicating intent. "Selecting all even numbers" is.

In general, you almost never iterate over a collection just for iteration's sake. You either want to

  • transform each element in some way (that's usually called map, in Ruby and Smalltalk it's collect and in .NET and SQL it's Select),
  • reduce the whole collection down to some single value, e.g. computing the sum or the average or the standard deviation of a list of football scores (in category theory, that's called a catamorphism, in functional programming it is fold or reduce, in Smalltalk it's inject:into:, in Ruby it's inject and in .NET Aggregate),
  • filter out all elements that satisfy a certain condition (filter in most functional languages, select in Smalltalk and Ruby, also find_all in Ruby, Where in .NET and SQL),
  • filter out all elements that do not satisfy a condition (reject in Smalltalk and Ruby)
  • find the first element that satisfies a condition (find in Ruby)
  • count the elements thats satisfy a condition (count in Ruby)
  • check if all elements (all?), at least one element (any?) or no elements (none?) satisfy a condition
  • group the elements into buckets based on some discriminator (group_by in Ruby, .NET and SQL)
  • partition the collection into two collections based on some predicate (partition)
  • sort the collection (sort, sort_by)
  • combine multiple collections into one (zip)
  • and so on and so forth …

Almost never is your goal to just iterate over a collection.

In particular, reduce aka. inject aka. fold aka. inject:into: aka. Aggregate aka. catamorphism is your friend. There's a reason why it has such a fancy-sounding mathematical name: it is extremely powerful. In fact, most of what I mentioned above, can be implemented in terms of reduce.

Basically, what reduce does, is it "reduces" the entire collection down to a single value, using some function. You have some sort of accumulator value, and then you take the accumulator value and the first element and feed it into the function. The result of that function then becomes the new accumulator, which you pair up with the second element and feed to the function and so on.

The most obvious example of this is summing a list of numbers:

[4, 8, 15, 16, 23, 42].reduce(0) {|acc, elem|
  acc + elem
}

So, the accumulator starts out as 0, and we pass the first element 4 into the + function. The result is 4, which becomes the new accumulator. Now we pass the next element 8 in and the result is 12. And this continues till the last element and the result is that they were dead the whole time. No, wait, the result is 108.

Ruby actually allows us to take a couple of shortcuts: If the element type is the same as the accumulator type, you can leave out the accumulator and Ruby will simply pass the first element as the first value for the accumulator:

[4, 8, 15, 16, 23, 42].reduce {|acc, elem|
  acc + elem
}

Also, we can use Symbol#to_proc here:

[4, 8, 15, 16, 23, 42].reduce(&:+)

And actually, if you pass reduce a Symbol argument it will treat as the name of the function to use for the reduction operation:

[4, 8, 15, 16, 23, 42].reduce(:+)

However, summing is not all that reduce can do. In fact, I find this example a little dangerous. Everybody I showed this to, immediately understood, "Aah, so that's what a reduce is", but unortunately some also thought that summing numbers is all reduce is, and that's definitely not the case. In fact, reduce is a general method of iteration, by which I mean that reduce can do anything that each can do. In particular, you can store arbitrary state in the accumulator.

For example, I wrote above that reduce reduces the collection down to a single value. But of course that "single value" can be arbitrarily complex. It could, for example, be itself a collection. Or a string:

class Array
  def mystery_method(foo)
    drop(1).reduce("#{first}") {|s, el| s << foo.to_str << el.to_s }
  end
end

This is an example how far you can go with playing tricks with the accumulator. If you try it out, you'll of course recognize it as Array#join:

class Array
  def join(sep=$,)
    drop(1).reduce("#{first}") {|s, el| s << sep.to_str << el.to_s }
  end
end

Note that nowhere in this "loop" do I have to keep track of whether I'm at the last or second-to-last element. Nor is there any conditional in the code. There is no potential for fencepost errors here. If you think about how to implement this with each, you would have to somehow keep track of the index and check whether you are at the last element and then have an if in there, to prevent emitting the separator at the end.

Since I wrote above that all iteration can be done with reduce, I might just as well prove it. Here's Ruby's Enumerable methods, implemented in terms of reduce instead of each as they normally are. (Note that I only just started and have only arrived at g yet.)

module Enumerable
  def all?
    reduce(true) {|res, el| res && yield(el) }
  end

  def any?
    reduce(false) {|res, el| res || yield(el) }
  end

  alias_method :map, def collect
    reduce([]) {|res, el| res << yield(el) }
  end

  def count
    reduce(0) {|res, el| res + 1 if yield el }
  end

  alias_method :find, def detect
    reduce(nil) {|res, el| if yield el then el end unless res }
  end

  def drop(n=1)
    reduce([]) {|res, el| res.tap {|res| res << el unless n -= 1 >= 0 }}
  end

  def drop_while
    reduce([]) {|res, el| res.tap {|res| res << el unless yield el }}
  end

  def each
    reduce(nil) {|_, el| yield el }
  end

  def each_with_index
    tap { reduce(-1) {|i, el| (i+1).tap {|i| yield el, i }}}
  end

  alias_method :select, def find_all
    reduce([]) {|res, el| res.tap {|res| res << el if yield el }}
  end

  def grep(pattern)
    reduce([]) {|res, el| res.tap {|res| res << yield(el) if pattern === el }}
  end

  def group_by
    reduce(Hash.new {|hsh, key| hsh[key] = [] }) {|res, el| res.tap {|res|
        res[yield el] = el
    }}
  end

  def include?(obj)
    reduce(false) {|res, el| break true if res || el == obj }
  end

  def reject
    reduce([]) {|res, el| res.tap {|res| res << el unless yield el }}
  end
end

[Note: I made some simplifications for the purpose of this post. For example, according to the standard Ruby Enumerable protocol, each is supposed to return self, so you'd have to slap an extra line in there; other methods behave slightly differently, depending on what kind and how many arguments you pass in and so on. I left those out because they distract from the point I am trying to make.]

like image 196
Jörg W Mittag Avatar answered Oct 02 '22 00:10

Jörg W Mittag


They're talking about more specialized methods such as map, filter or inject. For example, instead of this:

even_numbers = []
numbers.each {|num| even_numbers << num if num.even?}

You should do this:

even_numbers = numbers.select {|num| num.even?}

It says what you want to do but encapsulates all the irrelevant technical details in the select method. (And incidentally, in Ruby 1.8.7 or later, you can just write even_numbers = numbers.select(&:even?), so even more concise if slightly Perl-like.)

These aren't normally called "higher-order iterators," but whoever wrote that probably just had a minor mental mixup. It's a good principle whatever terminology you use.

like image 33
Chuck Avatar answered Oct 02 '22 01:10

Chuck