Pattern matching and infinite streams

Tags:

So, I'm working to teach myself Scala, and one of the things I've been playing with is the Stream class. I tried to use a naïve translation of the classic Haskell version of Dijkstra's solution to the Hamming number problem:

object LazyHammingBad {
  private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    (a, b) match {
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, b)
        else if (y < x) y #:: merge(a, ys)
        else x #:: merge(xs, ys)
    }

  val numbers: Stream[BigInt] =
    1 #:: merge(numbers map { _ * 2 },
      merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
}

Taking this for a spin in the interpreter led quickly to disappointment:

scala> LazyHammingBad.numbers.take(10).toList
java.lang.StackOverflowError

I decided to look to see if other people had solved the problem in Scala using the Haskell approach, and adapted this solution from Rosetta Code:

object LazyHammingGood {
  private def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    if (a.head < b.head) a.head #:: merge(a.tail, b)
    else if (b.head < a.head) b.head #:: merge(a, b.tail)
    else a.head #:: merge(a.tail, b.tail)

  val numbers: Stream[BigInt] = 
    1 #:: merge(numbers map {_ * 2}, 
            merge(numbers map {_ * 3}, numbers map {_ * 5}))
}

This one worked nicely, but I still wonder how I went wrong in LazyHammingBad. Does using #:: to destructure x #:: xs force the evaluation of xs for some reason? Is there any way to use pattern matching safely with infinite streams, or do you just have to use head and tail if you don't want things to blow up?

860

asked Sep 20 '11 22:09

Pillsy

2 Answers

a match {case x#::xs =>... is about the same as val (x, xs) = (a.head, a.tail). So the difference between the bad version and the good one, is that in that in the bad version, you're calling a.tail and b.tail right at the start, instead of just use them to build the tail of the resulting stream. Furthermore when you use them at the right of #:: (not pattern matching, but building the result, as in #:: merge(a.b.tail) you are not actually calling merge, that will be done only later, when accessing the tail of the returned Stream. So in the good version, a call to merge does not call tail at all. In the bad version, it calls it right at start.

Now if you consider numbers, or even a simplified version, say 1 #:: merge(numbers, anotherStream), when you call you call tail on that (as take(10) will), merge has to be evaluated. You call tail on numbers, which call merge with numbers as parameters, which calls tails on numbers, which calls merge, which calls tail...

By contrast, in super lazy Haskell, when you pattern match, it does barely any work. When you do case l of x:xs, it will evaluate l just enough to know whether it is an empty list or a cons. If it is indeed a cons, x and xs will be available as two thunks, functions that will eventually give access, later, to content. The closest equivalent in Scala would be to just test empty.

Note also that in Scala Stream, while the tail is lazy, the head is not. When you have a (non empty) Stream, the head has to be known. Which means that when you get the tail of the stream, itself a stream, its head, that is the second element of the original stream, has to be computed. This is sometimes problematic, but in your example, you fail before even getting there.

135

answered Oct 28 '22 03:10

Didier Dupont

Note that you can do what you want by defining a better pattern matcher for Stream:

Here's a bit I just pulled together in a Scala Worksheet:

object HammingTest {
  // A convenience object for stream pattern matching
  object #:: {
    class TailWrapper[+A](s: Stream[A]) {
      def unwrap = s.tail
    }
    object TailWrapper {
      implicit def unwrap[A](wrapped: TailWrapper[A]) = wrapped.unwrap
    }
    def unapply[A](s: Stream[A]): Option[(A, TailWrapper[A])] = {
      if (s.isEmpty) None
      else {
        Some(s.head, new TailWrapper(s))
      }
    }
  }

  def merge(a: Stream[BigInt], b: Stream[BigInt]): Stream[BigInt] =
    (a, b) match {
      case (x #:: xs, y #:: ys) =>
        if (x < y) x #:: merge(xs, b)
        else if (y < x) y #:: merge(a, ys)
        else x #:: merge(xs, ys)
    }                                             //> merge: (a: Stream[BigInt], b: Stream[BigInt])Stream[BigInt]

  lazy val numbers: Stream[BigInt] =
    1 #:: merge(numbers map { _ * 2 }, merge(numbers map { _ * 3 }, numbers map { _ * 5 }))
                                                  //> numbers  : Stream[BigInt] = <lazy>
  numbers.take(10).toList                         //> res0: List[BigInt] = List(1, 2, 3, 4, 5, 6, 8, 9, 10, 12)
}

Now you just need to make sure that Scala finds your object #:: instead of the one in Stream.class whenever it's doing pattern matching. To facilitate that, it might be best to use a different name like #>: or ##:: and then just remember to always use that name when pattern matching.

If you ever need to match the empty stream, use case Stream.Empty. Using case Stream() will attempt to evaluate your entire stream there in the pattern match, which will lead to sadness.

answered Oct 28 '22 05:10

Daniel Martin

Related questions
                            
                                How to use common/shared "blocks" between cucumber features?
                            
                                Proper way to generate html dynamically with jQuery
                            
                                Placeholder attribute not supported in IE. Any suggestions?
                            
                                rspec `its` syntax with dynamic conditions
                            
                                Can Native Client (NaCl) programs be written in languages other than C or C++?
                            
                                my own id in GORM
                            
                                Android: "trying to use a recycled bitmap" error with temporary Bitmaps
                            
                                Facebook authentication in a UIWebView does not redirect back to original page on my site asking for auth
                            
                                SQL DataType Negative Decimal
                            
                                a list of dynamic functions and dynamically calling them
                            
                                Why no destructing in def form?
                            
                                Exposing a C++ class instance to a python embedded interpreter

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With