I want an infinite non-strict series x1, x2, x3... that I can work with one element at a time, not memoizing the results in order to keep my memory usage constant. For the sake of specificity let's say it's a series of integers (e.g. the natural numbers, the odd numbers, the primes), though this question might apply to more general data types.
The easiest way to work with infinite lists is to use Scala's Stream
object. A common idiom is to write a function that returns a Stream
, uses the #::
operator to add a term to the series, and then calls itself recursively. For example, the following generates an infinite stream of integers given a starting value and a successor function.
def infiniteList(n: Int, f: Int => Int): Stream[Int] = {
n #:: infiniteList(f(n), f)
}
infiniteList(2, _*2+3).take(10) print
// returns 2, 7, 17, 37, 77, 157, 317, 637, 1277, 2557, empty
(I realize that the above is equivalent to the library call Stream.iterate(2)(_*2+3)
. I wrote it here as example of this infinite Stream
idiom.)
However, streams memoize their results, making their memory requirements non-constant, and potentially very large. The documentation states that memoization is avoided if you don't hold on to the head of a Stream
, but in practice this can be tricky. I may implement infinite list code in which I don't think I'm holding on to any stream heads, but if it still has unbounded memory requirements I have to figure out if the problem is that I'm handling my streams in some way that does cause memoization, or if it is something else. This can be a difficult debugging task and has code smell because I'm trying to trick an explicitly memoized data structure into returning a non-memoized result.
What I want is something with the semantics of Stream
expect without memoization. It doesn't appear that such an object exists in Scala. I've been experimenting with using iterators to implement infinite numeric series, but the mutability of iterators makes this tricky when you start wanting to do comprehension operations on them. I've also tried to write my own code from scratch, but it's not clear where I should start (do I subclass Traversable
?) or how to avoid reimplementing the functionality in map
, fold
, etc.
Does someone have good example Scala code of an implementation of a non-strict, immutable, non-memoized infinite list?
More generally, I understand the semantics of traversable, iterable, sequence, stream, and view, but the fact that I find this question so vexing makes me feel like I'm misunderstanding something. It seems to me that non-strictness and non-memoization are completely orthogonal properties, but Scala seems to have made a design decision to unify them in Stream
and given no easy way to peel them apart. Is this an oversight on Scala's part, or is there some deep connection between non-strictness and non-memoization that I am overlooking?
I realize the question is fairly abstract. Here is some additional context that ties it to a specific problem.
I am encountering this issue in the course of implementing a prime number generator as described in Meissa O'Niell's paper "The Genuine Sieve of Eratosthenes", and it's difficult to give a simple example of where an Iterator
object is inadequate without pulling in a lot of the details from that paper. Here is a complete implementation using streams that is very elegant but has suspiciously large memory consumption.
Here is a simplified implementation with iterators that does not compile but gives you an idea of what I want.
import scala.collection.mutable
object ONeillSieve {
class NumericSeries extends BufferedIterator[Int] with Ordered[NumericSeries] {
def hasNext = true
def compare(that: NumericSeries) = that.head.compare(head)
override def toString() = head + "..."
var head = 3
def next() = {
val r = head
head += 2
r
}
}
def main(args: Array[String]) {
val q = mutable.PriorityQueue[NumericSeries]()
val odds = new NumericSeries
q += odds.map(odds.head * _)
odds.next()
q += odds.map(odds.head * _)
println("Sieve = %s\nInput = %s".format(q, odds))
}
}
I need to build a PriorityQueue
of infinite numeric series keyed by their smallest element. (Hence I use a BufferedIterator
instead of just a plain Iterator
.) Also note that here the basis for infinite series is the odd integers but the most general solution involves more complicated series. At the end of the main function I want the queue to contain two infinite series:
The above does not compile because odds.map(...)
returns an Iterator
, not a NumericSeries
and hence cannot be added to the priority queue.
At this point it looks like I'm wading into collections classes extensions, and that's tricky so I want to make sure I'm not having to do it unless absolutely necessary.
EDIT: The below doesn't preserve the Generator
type when using filter
or map
; indeed trying to implement a full 'MyType' for the generator is more or less impossible (look into IndexedSeqView
source code to see the mess).
But there is an even easier way around (see my third answer)
Ok, my second try. In order to keep the lazy behaviour for map
, filter
, etc., the best might be to subclass SeqView
or StreamView
:
import collection.immutable.StreamView
final case class Generator[A](override val head: A, fun: A => A)
extends StreamView[A, Generator[A]] {
protected def underlying = this
def length: Int = Int.MaxValue // ?
def iterator = Iterator.iterate(head)(fun)
def apply(idx: Int): A = {
if(idx < 0) throw new IndexOutOfBoundsException(idx.toString)
var res = head
var i = idx; while(i > 0) {
res = fun(res)
i -= 1
}
res
}
}
(I took Rex' suggestion to call this "Generator").
val i = Generator[Int](2, _ * 2 + 3)
i.take(4).foreach(println)
val j = i.map(_ * 0.5)
j.take(4).foreach(println)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With