Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Non-strict, Immutable, Non-memoized Infinite series in Scala

I want an infinite non-strict series x1, x2, x3... that I can work with one element at a time, not memoizing the results in order to keep my memory usage constant. For the sake of specificity let's say it's a series of integers (e.g. the natural numbers, the odd numbers, the primes), though this question might apply to more general data types.

The easiest way to work with infinite lists is to use Scala's Stream object. A common idiom is to write a function that returns a Stream, uses the #:: operator to add a term to the series, and then calls itself recursively. For example, the following generates an infinite stream of integers given a starting value and a successor function.

  def infiniteList(n: Int, f: Int => Int): Stream[Int] = {
      n #:: infiniteList(f(n), f)
  }
  infiniteList(2, _*2+3).take(10) print
  // returns 2, 7, 17, 37, 77, 157, 317, 637, 1277, 2557, empty

(I realize that the above is equivalent to the library call Stream.iterate(2)(_*2+3). I wrote it here as example of this infinite Stream idiom.)

However, streams memoize their results, making their memory requirements non-constant, and potentially very large. The documentation states that memoization is avoided if you don't hold on to the head of a Stream, but in practice this can be tricky. I may implement infinite list code in which I don't think I'm holding on to any stream heads, but if it still has unbounded memory requirements I have to figure out if the problem is that I'm handling my streams in some way that does cause memoization, or if it is something else. This can be a difficult debugging task and has code smell because I'm trying to trick an explicitly memoized data structure into returning a non-memoized result.

What I want is something with the semantics of Stream expect without memoization. It doesn't appear that such an object exists in Scala. I've been experimenting with using iterators to implement infinite numeric series, but the mutability of iterators makes this tricky when you start wanting to do comprehension operations on them. I've also tried to write my own code from scratch, but it's not clear where I should start (do I subclass Traversable?) or how to avoid reimplementing the functionality in map, fold, etc.

Does someone have good example Scala code of an implementation of a non-strict, immutable, non-memoized infinite list?

More generally, I understand the semantics of traversable, iterable, sequence, stream, and view, but the fact that I find this question so vexing makes me feel like I'm misunderstanding something. It seems to me that non-strictness and non-memoization are completely orthogonal properties, but Scala seems to have made a design decision to unify them in Stream and given no easy way to peel them apart. Is this an oversight on Scala's part, or is there some deep connection between non-strictness and non-memoization that I am overlooking?


I realize the question is fairly abstract. Here is some additional context that ties it to a specific problem.

I am encountering this issue in the course of implementing a prime number generator as described in Meissa O'Niell's paper "The Genuine Sieve of Eratosthenes", and it's difficult to give a simple example of where an Iterator object is inadequate without pulling in a lot of the details from that paper. Here is a complete implementation using streams that is very elegant but has suspiciously large memory consumption.

Here is a simplified implementation with iterators that does not compile but gives you an idea of what I want.

import scala.collection.mutable

object ONeillSieve {

  class NumericSeries extends BufferedIterator[Int] with Ordered[NumericSeries] {
    def hasNext = true

    def compare(that: NumericSeries) = that.head.compare(head)

    override def toString() = head + "..."

    var head = 3

    def next() = {
      val r = head
      head += 2
      r
   }
 }

def main(args: Array[String]) {
    val q = mutable.PriorityQueue[NumericSeries]()

    val odds = new NumericSeries

    q += odds.map(odds.head * _)
    odds.next()
    q += odds.map(odds.head * _)

    println("Sieve = %s\nInput = %s".format(q, odds))
  }
}

I need to build a PriorityQueue of infinite numeric series keyed by their smallest element. (Hence I use a BufferedIterator instead of just a plain Iterator.) Also note that here the basis for infinite series is the odd integers but the most general solution involves more complicated series. At the end of the main function I want the queue to contain two infinite series:

  1. 3x(odds starting from 3) (i.e. 9,12,15...)
  2. 5x(odds starting from 5) (i.e. 25,30,35...)

The above does not compile because odds.map(...) returns an Iterator, not a NumericSeries and hence cannot be added to the priority queue.

At this point it looks like I'm wading into collections classes extensions, and that's tricky so I want to make sure I'm not having to do it unless absolutely necessary.

like image 314
W.P. McNeill Avatar asked Jan 12 '13 21:01

W.P. McNeill


1 Answers

EDIT: The below doesn't preserve the Generator type when using filter or map; indeed trying to implement a full 'MyType' for the generator is more or less impossible (look into IndexedSeqView source code to see the mess).

But there is an even easier way around (see my third answer)


Ok, my second try. In order to keep the lazy behaviour for map, filter, etc., the best might be to subclass SeqView or StreamView:

import collection.immutable.StreamView

final case class Generator[A](override val head: A, fun: A => A)
extends StreamView[A, Generator[A]] {
  protected def underlying = this
  def length: Int = Int.MaxValue  // ?
  def iterator = Iterator.iterate(head)(fun)
  def apply(idx: Int): A = {
    if(idx < 0) throw new IndexOutOfBoundsException(idx.toString)
    var res = head
    var i = idx; while(i > 0) {
      res = fun(res)
      i -= 1
    }
    res
  }
}

(I took Rex' suggestion to call this "Generator").

val i = Generator[Int](2, _ * 2 + 3)
i.take(4).foreach(println)
val j = i.map(_ * 0.5)
j.take(4).foreach(println)
like image 59
0__ Avatar answered Sep 23 '22 03:09

0__