I want a convenient way to generate an Iterable
, given a initial object and a function to produce the next object from the current one, that consumes O(1) memory (i.e., it doesn't cache old results; if you want to iterate a second time, the function has to be applied again).
It doesn't seem like there's library support for this. In Scala 2.8, the method scala.collection.Iterable.iterate
has signature
def iterate [A] (start: A, len: Int)(f: (A) ⇒ A) : Iterable[A]
so it requires that you specify how many iterated function applications you're interested in ahead of time, and my understanding of the documentation is that Iterable.iterate
actually computes all these values immediately. On the other hand, the method scala.collection.Iterator.iterate
has signature
def iterate [T] (start: T)(f: (T) ⇒ T) : Iterator[T]
which looks great, but we only get an Iterator
which doesn't offer all the convenience of map
, filter
and friends.
Is there a convenient library method to produce what I want?
and if not,
Can someone suggest the 'colloquial' Scala code for doing this?
To summarise, given an initial object a: A
, and a function f: A => A
, I'd like a TraversableLike
(e.g., probably an Iterable
) that generates a, f(a), f(f(a)), ...
, and uses O(1) memory, with map
, filter
etc. functions that also return something that is O(1) in memory.
Stream
will do what you want, just don't hold on to the cells; only iterate over the values.
It is a sadly common misconception floating around that Streams inherently cache every value they compute.
If you write this:
val s1: Stream[Thing] = initialValue #:: «expression computing next value»
then indeed every value produced by the stream is retained, but this is not necessary. If you write:
def s2: Stream[Thing] = initialValue #:: «expression computing next value»
and if the caller just iterates over the stream's values but does not remember the Stream value itself (specifically any of its cons cells), then no unwanted retention will occur. Of course, in this formulation, every call creates a new Stream
starting from a fixed initial value. That's not necessary:
def s3(start: Thing): Stream[Thing] = start #:: «expression computing next value»
The one thing you have to watch out for is passing a Stream
to a method. Doing so will capture the head of the stream passed in the method parameter. One way around this is to process the stream with tail-recursive code.
Iterator.iterate
demo with filter:
object I {
def main(args:Array[String]) {
val mb = 1024 * 1024
val gen = Iterator.iterate(new Array[Int](10 * mb)){arr =>
val res = new Array[Int](10 * mb)
arr.copyToArray(res)
println("allocated 10mb")
res(0) = arr(0) + 1 // store iteration count in first elem of new array
res
}
// take 1 out of 100
val gen2 = gen filter (arr => arr(0) % 100 == 0)
// print first 10 filtered
gen2.take(10).foreach { arr => println("filtered " + arr(0)) }
}
}
(this may not work in the REPL as the PRINT step may mess up with memory management)
JAVA_OPTS="-Xmx128m" scala -cp classes I
will show that the filtering works and is lazy. If it wasn't done in constant memory that would cause a heap error (since it's allocating something like 900*10mb).
Use JAVA_OPTS="-Xmx128m -verbose:gc" scala -cp classes I
to see the garbage collection events.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With