How can I execute multiple tasks in Scala?

I have 50,000 tasks and want to execute them with 10 threads. In Java I would create a pool with Executors.newFixedThreadPool(10), pass Runnables to it, and then wait for them all to be processed. As I understand it, Scala is especially useful for this kind of task, but I can't find a solution in the docs.

yura asked Dec 22 '10 16:12



2 Answers

Scala 2.9.3 and later

The simplest approach is to use the scala.concurrent.Future class and its associated infrastructure. Future.apply, usually written Future { ... }, asynchronously evaluates the block passed to it and immediately returns a Future[A] representing the asynchronous computation. (Scala 2.10 also offered a lowercase scala.concurrent.future method that does the same thing; it was deprecated in 2.11 and removed in 2.13.) Futures can be manipulated in a number of non-blocking ways, including mapping, flatMapping, filtering, and recovering from errors.

For example, here's a sample that creates 10 tasks, where each task sleeps for a different amount of time and then returns the square of the value passed to it.

    import scala.concurrent._ // Future and Await
    import scala.concurrent.duration._
    import scala.concurrent.ExecutionContext.Implicits.global

    // Future { ... } replaces the lowercase future { ... } helper,
    // which was deprecated in Scala 2.11 and removed in 2.13.
    val tasks: Seq[Future[Int]] = for (i <- 1 to 10) yield Future {
      println("Executing task " + i)
      Thread.sleep(i * 1000L)
      i * i
    }

    val aggregated: Future[Seq[Int]] = Future.sequence(tasks)

    val squares: Seq[Int] = Await.result(aggregated, 15.seconds)
    println("Squares: " + squares)

In this example, we first create a sequence of individual asynchronous tasks that, when complete, each provide an Int. We then use Future.sequence to combine those async tasks into a single async task, swapping the position of the Future and the Seq in the type. Finally, we block the current thread for up to 15 seconds while waiting for the result. The example uses the global execution context, which is backed by a fork/join thread pool sized by default to the number of available processors. For non-trivial programs, you would probably want to use an application-specific ExecutionContext.
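To match the setup in the question (many tasks, exactly 10 threads), an application-specific ExecutionContext can be built on top of a fixed-size Java thread pool. The following is a minimal sketch and not part of the original answer: the names FixedPoolExample, runAll, and process are hypothetical, process stands in for the real per-task work, and the two-minute timeout is an arbitrary assumption.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object FixedPoolExample {
  // Hypothetical stand-in for the real per-task work.
  def process(n: Int): Int = n * 2

  def runAll(taskCount: Int, threads: Int): Seq[Int] = {
    // Wrap a fixed-size Java pool so the Futures run on at most `threads` threads.
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(pool)
    try {
      // Future.traverse starts one Future per input and collects results in input order.
      val all: Future[Seq[Int]] =
        Future.traverse((1 to taskCount).toList)(n => Future(process(n)))
      // Blocking here is acceptable only because this is the end of the batch job.
      Await.result(all, 2.minutes)
    } finally {
      ec.shutdown() // release the pool's threads once the batch is done
    }
  }

  def main(args: Array[String]): Unit = {
    val results = runAll(taskCount = 50000, threads = 10)
    println("Processed " + results.size + " tasks")
  }
}
```

Because the pool is fixed at 10 threads, at most 10 tasks run at any moment and the rest wait in the executor's queue, which is the same behaviour as the Java approach described in the question.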

Generally, blocking should be avoided whenever possible. Other methods on the Future class help you program in an asynchronous style, including onComplete and, in older versions, onSuccess and onFailure (both deprecated in Scala 2.12 and removed in 2.13).
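As a rough illustration of that non-blocking style, the sketch below chains map and recover and then registers an onComplete callback instead of calling Await. It is not from the original answer; CombinatorExample, square, and describe are hypothetical names, and the Thread.sleep at the end exists only to keep this small demo alive, since the global context uses daemon threads.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

object CombinatorExample {
  // Hypothetical task: fails for negative input so that recovery can be shown.
  def square(n: Int): Future[Int] = Future {
    require(n >= 0, "negative input") // throws IllegalArgumentException when violated
    n * n
  }

  // map transforms the eventual value; recover turns a failure into a fallback value.
  def describe(n: Int): Future[String] =
    square(n)
      .map(sq => "square of " + n + " is " + sq)
      .recover { case _: IllegalArgumentException => "could not square " + n }

  def main(args: Array[String]): Unit = {
    // onComplete registers a callback rather than blocking the calling thread.
    describe(4).onComplete {
      case Success(msg) => println(msg)
      case Failure(e)   => println("failed: " + e)
    }
    Thread.sleep(500) // demo only: give the callback time to run before the JVM exits
  }
}
```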

Also, consider investigating the Akka library, which provides actor-based concurrency for Scala and Java, and interoperates with scala.concurrent.

Scala 2.9.2 and prior

The simplest approach is to use Scala's Future class, which is a sub-component of the Actors framework. The scala.actors.Futures.future method creates a Future for the block passed to it. You can then use scala.actors.Futures.awaitAll to wait for all tasks to complete.

For example, here's a sample that creates 10 tasks, where each task sleeps for a different amount of time and then returns the square of the value passed to it.

    import scala.actors.Futures._

    val tasks = for (i <- 1 to 10) yield future {
      println("Executing task " + i)
      Thread.sleep(i * 1000L)
      i * i
    }

    // Wait up to 20 seconds (timeout in milliseconds) for all tasks.
    val squares = awaitAll(20000L, tasks: _*)
    println("Squares: " + squares)
mpilquist answered Sep 18 '22 21:09


You want to look at either the Scala actors library or Akka. Akka has cleaner syntax, but either will do the trick.

So it sounds like you need to create a pool of actors that know how to process your tasks. An Actor can basically be any class with a receive method - from the Akka tutorial (http://doc.akkasource.org/tutorial-chat-server-scala):

    class MyActor extends Actor {
      def receive = {
        case "test" => println("received test")
        case _      => println("received unknown message")
      }
    }

    val myActor = Actor.actorOf[MyActor]
    myActor.start

You'll want to create a pool of actor instances and fire your tasks to them as messages. Here's a post on Akka actor pooling that may be helpful: http://vasilrem.com/blog/software-development/flexible-load-balancing-with-akka-in-scala/

In your case, one actor per task may be appropriate (actors are extremely lightweight compared to threads so you can have a LOT of them in a single VM), or you might need some more sophisticated load balancing between them.

EDIT: Using the example actor above, sending it a message is as easy as this:

myActor ! "test" 

The actor will then output "received test" to standard output.

Messages can be of any type, and when combined with Scala's pattern matching, you have a powerful pattern for building flexible concurrent applications.

In general Akka actors will "do the right thing" in terms of thread sharing, and for the OP's needs, I imagine the defaults are fine. But if you need to, you can set the dispatcher the actor should use to one of several types:

* Thread-based
* Event-based
* Work-stealing
* HawtDispatch-based event-driven

It's trivial to set a dispatcher for an actor:

    class MyActor extends Actor {
      self.dispatcher = Dispatchers.newExecutorBasedEventDrivenDispatcher("thread-pool-dispatch")
        .withNewThreadPoolWithBoundedBlockingQueue(100)
        .setCorePoolSize(10)
        .setMaxPoolSize(10)
        .setKeepAliveTimeInMillis(10000)
        .build
    }

See http://doc.akkasource.org/dispatchers-scala

In this way, you could limit the thread pool size. Again, though, the original use case could probably be satisfied with 50K Akka actor instances using the default dispatchers, and it would parallelize nicely.

This really only scratches the surface of what Akka can do. It brings a lot of what Erlang offers to the Scala language. Actors can monitor other actors and restart them, creating self-healing applications. Akka also provides Software Transactional Memory and many other features. It's arguably the "killer app" or "killer framework" for Scala.

Janx answered Sep 22 '22 21:09