
Scala Parallel Collections- How to return early?

I have a list of possible input values:

val inputValues = List(1,2,3,4,5)

I have a function that takes a really long time to compute and gives me a result:

def reallyLongFunction( input: Int ) : Option[String] = { ..... }

Using Scala parallel collections, I can easily do

inputValues.par.map( reallyLongFunction( _ ) )

to get all the results, in parallel. The problem is that I don't really want all the results, I only want the FIRST one. As soon as one of my inputs succeeds, I want that output and want to move on with my life. The map above does a lot of extra work.

So how do I get the best of both worlds? I want to

  1. Get the first result that returns something from my long function
  2. Stop all my other threads from doing useless work.

Edit: I solved it like a dumb Java programmer by having

@volatile var done = false;

which is set and checked inside my reallyLongFunction. This works, but does not feel very Scala-like. I would like a better way to do this.
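For reference, the flag approach described in the edit could look roughly like the sketch below. Everything specific here is an assumption for illustration: the body of reallyLongFunction (a sleep loop that succeeds on even inputs), the EarlyExitFlag object, and firstResult are hypothetical. Plain Futures stand in for .par so the snippet runs with only the standard library, and an AtomicBoolean plays the role of the @volatile var.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EarlyExitFlag {
  // Shared flag, set once any input has produced a result.
  private val done = new AtomicBoolean(false)

  // Hypothetical stand-in for the slow function: about 100 ms of "work",
  // succeeding only for even inputs (an assumption for this demo).
  def reallyLongFunction(input: Int): Option[String] = {
    var step = 0
    while (step < 50) {
      if (done.get) return None // another task already succeeded, give up
      Thread.sleep(2)
      step += 1
    }
    val result = if (input % 2 == 0) Some(s"result-$input") else None
    if (result.isDefined) done.set(true) // tell the other tasks to stop
    result
  }

  // Runs every input concurrently and keeps the first defined result
  // in list order. Tasks that see the flag return None almost immediately.
  def firstResult(inputs: List[Int]): Option[String] = {
    done.set(false)
    val futures = inputs.map(i => Future(reallyLongFunction(i)))
    Await.result(Future.sequence(futures), 30.seconds).flatten.headOption
  }
}
```

Calling EarlyExitFlag.firstResult(List(1, 2, 3, 4, 5)) returns the result for one of the even inputs; which one depends on scheduling. The slow function has to poll the flag itself, and that is the part that does not feel very Scala-like.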

bwawok asked Dec 11 '11 22:12



1 Answer

(Update: no, this doesn't work: find returns the matching input element, not the mapped result.)

Would it work to do something like:

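// find aborts the remaining tasks once the predicate holds, but it returns the input, not the computed value
inputValues.par.find(v => reallyLongFunction(v).isDefined)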

The parallel collections implementation of find uses this:

  protected[this] class Find[U >: T](pred: T => Boolean, protected[this] val pit: IterableSplitter[T]) extends Accessor[Option[U], Find[U]] {
    @volatile var result: Option[U] = None
    def leaf(prev: Option[Option[U]]) = { if (!pit.isAborted) result = pit.find(pred); if (result != None) pit.abort }
    protected[this] def newSubtask(p: IterableSplitter[T]) = new Find(pred, p)
    override def merge(that: Find[U]) = if (this.result == None) result = that.result
  }

which looks pretty similar in spirit to your @volatile except you don't have to look at it ;-)
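Since find hands back the input rather than the computed value, one alternative is to leave parallel collections and race plain Futures against a Promise. This is a swapped-in technique, not what find does internally, and the names FirstSuccess, firstDefined and firstDefinedBlocking are hypothetical. A minimal sketch:

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object FirstSuccess {
  // Completes with the first Some produced by f, or with None once
  // every task has finished without producing a result.
  def firstDefined[A, B](inputs: Seq[A])(f: A => Option[B])(
      implicit ec: ExecutionContext): Future[Option[B]] = {
    val promise = Promise[Option[B]]()
    val remaining = new AtomicInteger(inputs.size)
    if (inputs.isEmpty) promise.trySuccess(None)
    inputs.foreach { a =>
      Future(f(a)).onComplete { outcome =>
        // trySuccess is a no-op after the first completion, so only the
        // earliest result wins; failed tasks count as "no result".
        outcome.toOption.flatten.foreach(b => promise.trySuccess(Some(b)))
        if (remaining.decrementAndGet() == 0) promise.trySuccess(None)
      }
    }
    promise.future
  }

  // Blocking convenience wrapper on the global execution context.
  def firstDefinedBlocking[A, B](inputs: Seq[A])(f: A => Option[B]): Option[B] = {
    import ExecutionContext.Implicits.global
    Await.result(firstDefined(inputs)(f), 30.seconds)
  }
}
```

This covers the first goal (return as soon as one input succeeds) but not the second: the losing tasks keep running unless the function itself checks a shared flag like the @volatile one in the question.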

Havoc P answered Sep 28 '22 00:09
