Here we have a SeqPar object containing a task routine, a simple mock Future that prints some debugging info and returns a Future[Int].
The question is: why is experiment1 allowed to run in parallel, while experiment2 always runs sequentially?
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object SeqPar {
  def experiment1: Int = {
    val f1 = task(1)
    val f2 = task(2)
    val f3 = task(3)
    val computation = for {
      r1 <- f1
      r2 <- f2
      r3 <- f3
    } yield (r1 + r2 + r3)
    Await.result(computation, Duration.Inf)
  }

  def experiment2: Int = {
    val computation = for {
      r1 <- task(1)
      r2 <- task(2)
      r3 <- task(3)
    } yield (r1 + r2 + r3)
    Await.result(computation, Duration.Inf)
  }

  def task(i: Int): Future[Int] = {
    Future {
      println(s"task=$i thread=${Thread.currentThread().getId} time=${System.currentTimeMillis()}")
      i * i
    }
  }
}
When I run experiment1, it prints:
task=3 thread=24 time=1541326607613
task=1 thread=22 time=1541326607613
task=2 thread=21 time=1541326607613
While experiment2 prints:
task=1 thread=21 time=1541326610653
task=2 thread=20 time=1541326610653
task=3 thread=21 time=1541326610654
What is the reason for the observed difference? I do know that the for comprehension is desugared to f1.flatMap(r1 => f2.flatMap(r2 => f3.map(r3 => r1 + r2 + r3))), but I am still missing the point of why one is allowed to run in parallel and the other isn't.
This is an effect of what Future(…) and flatMap do:
- val future = Future(task) starts running task in parallel.
- future.flatMap(result => task) arranges for running task when future completes.
Note that future.flatMap(result => task) cannot start running task in parallel before future completes, because to run task we need result, which is only available when future completes.
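The two behaviours described above can be checked in isolation. The following is a minimal sketch, not part of the original code: EagerStartDemo, started, gate and continuationRan are hypothetical names, and a Promise is used only so that the moment of completion is under our control.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EagerStartDemo {
  // Returns (bodyRanWithoutAwait, continuationRanBeforeCompletion).
  def run(): (Boolean, Boolean) = {
    // 1. Creating a Future schedules its body right away; nobody awaits it here.
    val started = new CountDownLatch(1)
    Future { started.countDown() }
    val bodyRanWithoutAwait = started.await(5, TimeUnit.SECONDS)

    // 2. The function passed to flatMap runs only once the upstream future completes.
    val gate = Promise[Int]()                 // a future we complete by hand
    val continuationRan = new AtomicBoolean(false)
    val chained = gate.future.flatMap { r =>
      continuationRan.set(true)
      Future(r + 1)
    }
    Thread.sleep(100)                         // give it a chance to run early (it should not)
    val ranEarly = continuationRan.get()
    gate.success(41)                          // now the continuation becomes runnable
    Await.result(chained, 5.seconds)
    (bodyRanWithoutAwait, ranEarly)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Under these assumptions, the first element should be true (the body ran although nothing awaited it) and the second false (the flatMap function did not run before gate was completed).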
Now let's look at your experiment1:
def experiment1: Int = {
  // construct three independent tasks and start running them
  val f1 = task(1)
  val f2 = task(2)
  val f3 = task(3)
  // construct one complicated task that is ...
  val computation =
    // ... waiting for f1 and then ...
    f1.flatMap(r1 =>
      // ... waiting for f2 and then ...
      f2.flatMap(r2 =>
        // ... waiting for f3 and then ...
        f3.map(r3 =>
          // ... adding some numbers.
          r1 + r2 + r3)))
  // block until the combined task completes (the three tasks are already running)
  Await.result(computation, Duration.Inf)
}
So in experiment1, since all three tasks take the same time and were started at the same time, we probably only have to block when waiting for f1. When we get around to waiting for f2, its result should already be there.
Now how does experiment2 differ?
def experiment2: Int = {
  // construct one complicated task that is ...
  val computation =
    // ... starting task1 and then waiting for it and then ...
    task(1).flatMap(r1 =>
      // ... starting task2 and then waiting for it and then ...
      task(2).flatMap(r2 =>
        // ... starting task3 and then waiting for it and then ...
        task(3).map(r3 =>
          // ... adding some numbers.
          r1 + r2 + r3)))
  // block until the combined task completes; task2 and task3 each start only after the previous task finishes
  Await.result(computation, Duration.Inf)
}
In this example, we do not even construct task(2) until task(1) has finished, so the tasks cannot run in parallel.
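Because the tasks in the question finish almost instantly, the printed timestamps do not show the difference very clearly. Here is a sketch that makes it observable without relying on timing: each task counts down a shared latch and then waits briefly for the other two, so it reports true only if all three tasks were in flight at the same time. OverlapDemo, the barrier parameter and the dedicated three-thread pool are assumptions made for this illustration; the pool is there so the result does not depend on how many cores the global execution context sees.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object OverlapDemo {
  // Returns (overlapInExperiment1Style, overlapInExperiment2Style).
  def run(): (Boolean, Boolean) = {
    val pool = Executors.newFixedThreadPool(3)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Reports true only if all three tasks sharing `barrier` are running at once.
      def task(barrier: CountDownLatch): Future[Boolean] = Future {
        barrier.countDown()
        barrier.await(500, TimeUnit.MILLISECONDS)
      }

      // experiment1 style: all three futures are created before the for comprehension.
      val b1 = new CountDownLatch(3)
      val f1 = task(b1)
      val f2 = task(b1)
      val f3 = task(b1)
      val eager = for { a <- f1; b <- f2; c <- f3 } yield a && b && c
      val overlap1 = Await.result(eager, 5.seconds)

      // experiment2 style: each future is created inside the previous flatMap.
      val b2 = new CountDownLatch(3)
      val chained = for { a <- task(b2); b <- task(b2); c <- task(b2) } yield a && b && c
      val overlap2 = Await.result(chained, 5.seconds)

      (overlap1, overlap2)
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

With this setup the experiment1 style should report true, while the experiment2 style should report false, because its first task times out waiting for two tasks that have not been created yet.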
So when programming with Scala's Future, you have to control your concurrency by choosing correctly between code like experiment1 and code like experiment2. Or you can look into libraries that provide more explicit control over concurrency.
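Staying within the standard library, one way to make the parallel intent explicit is to combine futures that have already been created, for example with Future.sequence or zip. This is only a sketch of that option, not the approach the libraries mentioned above take; ParallelCombinators, viaSequence and viaZip are hypothetical names, and task is a simplified copy of the one in the question without the println.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelCombinators {
  def task(i: Int): Future[Int] = Future(i * i)

  // All three futures are created (and therefore started) while building the
  // list, before Future.sequence combines them into one Future[List[Int]].
  def viaSequence(): Int = {
    val all: Future[List[Int]] = Future.sequence(List(task(1), task(2), task(3)))
    Await.result(all.map(_.sum), 5.seconds)
  }

  // zip takes its argument by value, so task(2) and task(3) are started
  // while the expression is built, not after task(1) completes.
  def viaZip(): Int = {
    val combined = task(1).zip(task(2)).zip(task(3)).map {
      case ((r1, r2), r3) => r1 + r2 + r3
    }
    Await.result(combined, 5.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(viaSequence()) // 1 + 4 + 9 = 14
    println(viaZip())      // 14
  }
}
```

Either form keeps the "start everything first, combine afterwards" structure of experiment1 visible in a single expression, which makes it harder to serialize the tasks by accident.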
This is because Scala Futures are strict (eager). The operation inside a Future starts executing as soon as the Future is created, and the Future then memoizes its value, so you lose referential transparency. In your first case the futures start executing at the three task calls, before the for comprehension, and their results are memoized; they are not executed again inside the for, which only combines them. In the second case each future is created inside the for comprehension, so each one starts only after the previous one has completed.
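The loss of referential transparency can be made concrete by counting how often a side effect runs. In this sketch (StrictnessDemo, effect and counter are hypothetical names introduced for the illustration), reusing one future value twice runs the effect once, while writing the same expression out twice runs it twice, so replacing a val by its definition changes the program's behaviour.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StrictnessDemo {
  // Returns (effectRunsWhenShared, effectRunsWhenInlined).
  def run(): (Int, Int) = {
    val counter = new AtomicInteger(0)
    def effect(): Future[Int] = Future(counter.incrementAndGet())

    // One future value, used twice: the body runs once and its result is memoized.
    val shared = effect()
    Await.result(for { x <- shared; y <- shared } yield (x, y), 5.seconds)
    val runsWhenShared = counter.get()

    // The same expression written out twice: the body runs once per occurrence.
    Await.result(for { x <- effect(); y <- effect() } yield (x, y), 5.seconds)
    val runsWhenInlined = counter.get() - runsWhenShared

    (runsWhenShared, runsWhenInlined)
  }

  def main(args: Array[String]): Unit = println(run()) // (1,2)
}
```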