Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Traversable => Java Iterator

I have a Traversable, and I want to make it into a Java Iterator. My problem is that I want everything to be lazily done. If I do .toIterator on the traversable, it eagerly produces the result, copies it into a List, and returns an iterator over the List.

I'm sure I'm missing something simple here...

Here is a small test case that shows what I mean:

class Test extends Traversable[String] {
      def foreach[U](f : (String) => U) {
         f("1")
         f("2")
         f("3")
         throw new RuntimeException("Not lazy!")
     }
}

val a = new Test
val iter = a.toIterator
like image 598
Andres Avatar asked Nov 02 '12 09:11

Andres


People also ask

What is iterator () in Java?

An Iterator is an object that can be used to loop through collections, like ArrayList and HashSet. It is called an "iterator" because "iterating" is the technical term for looping. To use an Iterator, you must import it from the java.util package.

How do I use an iterator in Java?

Generally, an iterator in Java is used to loop through any collection of objects. To apply the iterator, all you need to do is import the java. util package and then use the iterator() method. You can then use the iterator to perform multiple operations in the collection.

What is iterator remove ()?

An element can be removed from a Collection using the Iterator method remove(). This method removes the current element in the Collection. If the remove() method is not preceded by the next() method, then the exception IllegalStateException is thrown.

What is the return type of next () in iterator?

Object next(): It returns the next element in the collection until the hasNext()method return true. This method throws 'NoSuchElementException' if there is no next element. void remove(): It removes the current element in the collection.


2 Answers

The reason you can't get lazily get an iterator from a traversable is that you intrinsically can't. Traversable defines foreach, and foreach runs through everything without stopping. No laziness there.

So you have two options, both terrible, for making it lazy.

First, you can iterate through the whole thing each time. (I'm going to use the Scala Iterator, but the Java Iterator is basically the same.)

class Terrible[A](t: Traversable[A]) extends Iterator[A] {
  private var i = 0
  def hasNext = i < t.size   // This could be O(n)!
  def next: A = {
    val a = t.slice(i,i+1).head  // Also could be O(n)!
    i += 1
    a
  }
}

If you happen to have efficient indexed slicing, this will be okay. If not, each "next" will take time linear in the length of the iterator, for O(n^2) time just to traverse it. But this is also not necessarily lazy; if you insist that it must be you have to enforce O(n^2) in all cases and do

class Terrible[A](t: Traversable[A]) extends Iterator[A] {
  private var i = 0
  def hasNext: Boolean = {
    var j = 0
    t.foreach { a =>
      j += 1
      if (j>i) return true
    }
    false
  }
  def next: A = { 
    var j = 0
    t.foreach{ a => 
      j += 1
      if (j>i) { i += 1; return a }
    }
    throw new NoSuchElementException("Terribly empty")
  }
}

This is clearly a terrible idea for general code.

The other way to go is to use a thread and block the traversal of foreach as it's going. That's right, you have to do inter-thread communication on every single element access! Let's see how that works--I'm going to use Java threads here since Scala is in the middle of a switch to Akka-style actors (though any of the old actors or the Akka actors or the Scalaz actors or the Lift actors or (etc.) will work)

class Horrible[A](t: Traversable[A]) extends Iterator[A] {
  private val item = new java.util.concurrent.SynchronousQueue[Option[A]]()
  private class Loader extends Thread {
    override def run() { t.foreach{ a => item.put(Some(a)) }; item.put(None) }
  }
  private val loader = new Loader
  loader.start
  private var got: Option[A] = null
  def hasNext: Boolean = {
    if (got==null) { got = item.poll; hasNext }
    else got.isDefined
  }
  def next = {
    if (got==null) got = item.poll
    val ans = got.get
    got = null
    ans
  }
}

This avoids the O(n^2) disaster, but ties up a thread and has desperately slow element-by-element access. I get about two million accesses per second on my machine, as compared to >100M for a typical traversable. This is clearly a horrible idea for general code.

So there you have it. Traversable is not lazy in general, and there is no good way to make it lazy without compromising performance tremendously.

like image 199
Rex Kerr Avatar answered Nov 15 '22 06:11

Rex Kerr


I've run into this problem before and as far as I can tell, no one's particularly interested in making it easier to get an Iterator when all you've defined is foreach.

But as you've noted, toStream is the problem, so you could just override that:

class Test extends Traversable[String] {
  def foreach[U](f: (String) => U) {
    f("1")
    f("2")
    f("3")
    throw new RuntimeException("Not lazy!")
  }
  override def toStream: Stream[String] = {
    "1" #::
    "2" #::
    "3" #::
    Stream[String](throw new RuntimeException("Not lazy!"))
  }
}

Another alternative would be to define an Iterable instead of a Traversable, and then you'd get the iterator method directly. Could you explain a bit more what your Traversable is doing in your real use case?

like image 29
Steve Avatar answered Nov 15 '22 06:11

Steve