 

Scala Future vs Thread for a long running task without result

I want to build a simple server in Scala 2.11 that listens on a socket. It should asynchronously read data from the socket and pass the data into an Observable from RxScala.

I have a Java ServerSocket from which the data should be read with the method readData which is blocking. This method is started once and runs until the whole program stops:

import java.net.ServerSocket

val server = new ServerSocket(port)
def readData(server: ServerSocket): Unit = ???

I found two different ways to avoid blocking the whole program while the data is read from the socket:

new Thread {
  override def run(): Unit = {
    readData(server)
  }
}.start()

and

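import scala.concurrent.{Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global

Future {
  blocking {
    readData(server)
  }
}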

Because there is no return value wrapped in the Future that could be passed to other tasks, the Future's only job is to make the computation non-blocking. So I wonder whether there are any significant differences between these approaches. Looking into the implementation of Future, it looks like it also creates and runs a Runnable with the given block. So is one of these approaches preferable if one has a single and forever/long running task without a result?

nicoring asked Jul 27 '16 15:07


1 Answer

So is one of these approaches preferable if one has a long or forever running task without a result?

The two examples differ in that the former allocates a new thread per request, while the latter implicitly uses Scala's default ExecutionContext, which is backed by a ForkJoinPool: basically a pool of threads that can scale up or down as needed.
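To make that difference visible, here is a minimal sketch that runs the same function once on a dedicated thread and once inside a Future on the global ExecutionContext, and reports the thread name and daemon flag each time. The object name, the helper currentThreadInfo, and the thread name "socket-reader" are made up for illustration; currentThreadInfo merely stands in for readData.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ThreadVsFuture {
  // Hypothetical stand-in for readData: reports the thread it runs on.
  def currentThreadInfo(): (String, Boolean) = {
    val t = Thread.currentThread()
    (t.getName, t.isDaemon)
  }

  // Approach 1: one explicitly created, named, non-daemon thread.
  def onDedicatedThread(): (String, Boolean) = {
    var result: (String, Boolean) = null
    val t = new Thread(new Runnable {
      override def run(): Unit = result = currentThreadInfo()
    }, "socket-reader")
    t.setDaemon(false)
    t.start()
    t.join() // join() makes the write to `result` visible to this thread
    result
  }

  // Approach 2: a worker borrowed from the global ExecutionContext's pool.
  def onGlobalPool(): (String, Boolean) =
    Await.result(Future { blocking { currentThreadInfo() } }, 5.seconds)

  def main(args: Array[String]): Unit = {
    println(onDedicatedThread())
    println(onGlobalPool())
  }
}
```

As far as I know, the global pool's workers are daemon threads, so a Future that never completes will not by itself keep the JVM alive, whereas a non-daemon dedicated thread will. That is worth checking against the Scala version you actually run.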

One needs to remember that a thread doesn't come for free: a stack has to be allocated for it (its size varies by OS), system calls have to be made to register it, and so on (you can read more in Why is creating a Thread said to be expensive?).

Generally, I'd go with the latter, "naive" approach of using a Future with blocking, which uses the global ExecutionContext. Most importantly, I'd benchmark my code to make sure I am satisfied with the way it behaves, and then make adjustments based on those findings.

Another important thing to note: using a thread or a thread pool to process events still doesn't make it asynchronous; you're simply using multiple threads to process blocking, synchronous IO. I'm not familiar with the ServerSocket API, but in general, if it exposes a naturally async API, there need not be extra threads at all. For example, take a look at Netty, which has built-in support for async IO.


Edit

As you've clarified that the operation is a single invocation of readData, and the operation runs for as long as the application is alive, a single dedicated Thread would be a better idea here.
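A minimal sketch of that dedicated-thread setup might look like the following. The names DedicatedReader and startReader, the thread name, and the choice to make the thread a daemon are my assumptions, not something from the question; readData is passed in as a function so the sketch stays self-contained.

```scala
import java.net.{ServerSocket, SocketException}

object DedicatedReader {
  // Hypothetical helper: runs `readData` on a single named thread
  // for as long as the server socket stays open.
  def startReader(server: ServerSocket)(readData: ServerSocket => Unit): Thread = {
    val t = new Thread(new Runnable {
      override def run(): Unit =
        try readData(server)
        catch {
          // accept() throws SocketException once the socket is closed,
          // which this sketch treats as the signal to stop.
          case _: SocketException => ()
        }
    }, "socket-reader")
    t.setDaemon(true) // assumption: the reader should not keep the JVM alive
    t.start()
    t
  }
}
```

Usage would be something like `val reader = DedicatedReader.startReader(server)(readData)`, with `server.close()` called at shutdown to unblock the thread. Naming the thread makes it easy to spot in thread dumps, which is one practical advantage over borrowing an anonymous pool worker for the lifetime of the application.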

Yuval Itzchakov answered Nov 15 '22 04:11