I have a Traversable, and I want to make it into a Java Iterator. My problem is that I want everything to be lazily done. If I do .toIterator on the traversable, it eagerly produces the result, copies it into a List, and returns an iterator over the List.
I'm sure I'm missing something simple here...
Here is a small test case that shows what I mean:
class Test extends Traversable[String] {
def foreach[U](f : (String) => U) {
f("1")
f("2")
f("3")
throw new RuntimeException("Not lazy!")
}
}
val a = new Test
val iter = a.toIterator
An Iterator is an object that can be used to loop through collections, like ArrayList and HashSet. It is called an "iterator" because "iterating" is the technical term for looping. To use an Iterator, you must import it from the java.util package.
Generally, an iterator in Java is used to loop through any collection of objects. To apply the iterator, all you need to do is import the java. util package and then use the iterator() method. You can then use the iterator to perform multiple operations in the collection.
An element can be removed from a Collection using the Iterator method remove(). This method removes the current element in the Collection. If the remove() method is not preceded by the next() method, then the exception IllegalStateException is thrown.
Object next(): It returns the next element in the collection until the hasNext()method return true. This method throws 'NoSuchElementException' if there is no next element. void remove(): It removes the current element in the collection.
The reason you can't get lazily get an iterator from a traversable is that you intrinsically can't. Traversable defines foreach
, and foreach
runs through everything without stopping. No laziness there.
So you have two options, both terrible, for making it lazy.
First, you can iterate through the whole thing each time. (I'm going to use the Scala Iterator, but the Java Iterator is basically the same.)
class Terrible[A](t: Traversable[A]) extends Iterator[A] {
private var i = 0
def hasNext = i < t.size // This could be O(n)!
def next: A = {
val a = t.slice(i,i+1).head // Also could be O(n)!
i += 1
a
}
}
If you happen to have efficient indexed slicing, this will be okay. If not, each "next" will take time linear in the length of the iterator, for O(n^2)
time just to traverse it. But this is also not necessarily lazy; if you insist that it must be you have to enforce O(n^2)
in all cases and do
class Terrible[A](t: Traversable[A]) extends Iterator[A] {
private var i = 0
def hasNext: Boolean = {
var j = 0
t.foreach { a =>
j += 1
if (j>i) return true
}
false
}
def next: A = {
var j = 0
t.foreach{ a =>
j += 1
if (j>i) { i += 1; return a }
}
throw new NoSuchElementException("Terribly empty")
}
}
This is clearly a terrible idea for general code.
The other way to go is to use a thread and block the traversal of foreach as it's going. That's right, you have to do inter-thread communication on every single element access! Let's see how that works--I'm going to use Java threads here since Scala is in the middle of a switch to Akka-style actors (though any of the old actors or the Akka actors or the Scalaz actors or the Lift actors or (etc.) will work)
class Horrible[A](t: Traversable[A]) extends Iterator[A] {
private val item = new java.util.concurrent.SynchronousQueue[Option[A]]()
private class Loader extends Thread {
override def run() { t.foreach{ a => item.put(Some(a)) }; item.put(None) }
}
private val loader = new Loader
loader.start
private var got: Option[A] = null
def hasNext: Boolean = {
if (got==null) { got = item.poll; hasNext }
else got.isDefined
}
def next = {
if (got==null) got = item.poll
val ans = got.get
got = null
ans
}
}
This avoids the O(n^2)
disaster, but ties up a thread and has desperately slow element-by-element access. I get about two million accesses per second on my machine, as compared to >100M for a typical traversable. This is clearly a horrible idea for general code.
So there you have it. Traversable is not lazy in general, and there is no good way to make it lazy without compromising performance tremendously.
I've run into this problem before and as far as I can tell, no one's particularly interested in making it easier to get an Iterator
when all you've defined is foreach
.
But as you've noted, toStream
is the problem, so you could just override that:
class Test extends Traversable[String] {
def foreach[U](f: (String) => U) {
f("1")
f("2")
f("3")
throw new RuntimeException("Not lazy!")
}
override def toStream: Stream[String] = {
"1" #::
"2" #::
"3" #::
Stream[String](throw new RuntimeException("Not lazy!"))
}
}
Another alternative would be to define an Iterable
instead of a Traversable
, and then you'd get the iterator
method directly. Could you explain a bit more what your Traversable
is doing in your real use case?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With