This question is different from this one Difference between Java8 thenCompose and thenComposeAsync because I want to know what is the writer's reason for using thenCompose
and not thenComposeAsync
.
I was reading Modern Java in action and I came across this part of code on page 405:
public static List<String> findPrices(String product) {
ExecutorService executor = Executors.newFixedThreadPool(10);
List<Shop> shops = Arrays.asList(new Shop(), new Shop());
List<CompletableFuture<String>> priceFutures = shops.stream()
.map(shop -> CompletableFuture.supplyAsync(() -> shop.getPrice(product), executor))
.map(future -> future.thenApply(Quote::parse))
.map(future -> future.thenCompose(quote ->
CompletableFuture.supplyAsync(() -> Discount.applyDiscount(quote), executor)))
.collect(toList());
return priceFutures.stream()
.map(CompletableFuture::join).collect(toList());
}
Everything is Ok and I can understand this code but here is the writer's reason for why he didn't use thenComposeAsync
on page 408 which I can't understand:
In general, a method without the Async suffix in its name executes its task in the same threads the previous task, whereas a method terminating with Async always submits the succeeding task to the thread pool, so each of the tasks can be handled by a different thread. In this case, the result of the second CompletableFuture depends on the first,so it makes no difference to the final result or to its broad-brush timing whether you compose the two CompletableFutures with one or the other variant of this method
In my understanding with the thenCompose
( and thenComposeAsync
) signatures as below:
public <U> CompletableFuture<U> thenCompose(
Function<? super T, ? extends CompletionStage<U>> fn) {
return uniComposeStage(null, fn);
}
public <U> CompletableFuture<U> thenComposeAsync(
Function<? super T, ? extends CompletionStage<U>> fn) {
return uniComposeStage(asyncPool, fn);
}
The result of the second CompletableFuture
can depends on the previous CompletableFuture
in many situations(or rather I can say almost always), should we use thenCompose
and not thenComposeAsync
in those cases?
What if we have blocking code in the second CompletableFuture
?
This is a similar example which was given by person who answered similar question here: Difference between Java8 thenCompose and thenComposeAsync
public CompletableFuture<String> requestData(Quote quote) {
Request request = blockingRequestForQuote(quote);
return CompletableFuture.supplyAsync(() -> sendRequest(request));
}
To my mind in this situation using thenComposeAsync
can make our program faster because here blockingRequestForQuote
can be run on different thread. But based on the writer's opinion we should not use thenComposeAsync
because it depends on the first CompletableFuture
result(that is Quote).
My question is:
Is the writer's idea correct when he said :
In this case, the result of the second CompletableFuture depends on the first,so it makes no difference to the final result or to its broad-brush timing whether you compose the two CompletableFutures with one or the other variant of this method
TL;DR It is correct to use thenCompose
instead of thenComposeAsync
here, but not for the cited reasons. Generally, the code example should not be used as a template for your own code.
This chapter is a recurring topic on Stackoverflow for reasons we can best describe as “insufficient quality”, to stay polite.
In general, a method without the Async suffix in its name executes its task in the same threads the previous task, …
There is no such guaranty about the executing thread in the specification. The documentation says:
- Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
So there’s also the possibility that the task is performed “by any other caller of a completion method”. An intuitive example is
CompletableFuture<X> f = CompletableFuture.supplyAsync(() -> foo())
.thenApply(f -> f.bar());
There are two threads involved. One that invokes supplyAsync
and thenApply
and the other which will invoke foo()
. If the second completes the invocation of foo()
before the first thread enters the execution of thenApply
, it is possible that the future is already completed.
A future does not remember which thread completed it. Neither does it have some magic ability to tell that thread to perform an action despite it might be busy with something else or even have terminated since then. So it should be obvious that calling thenApply
on an already completed future can’t promise to use the thread that completed it. In most cases, it will perform the action immediately in the thread that calls thenApply
. This is covered by the specification’s wording “any other caller of a completion method”.
But that’s not the end of the story. As this answer explains, when there are more than two threads involved, the action can also get performed by another thread calling an unrelated completion method on the future at the same time. This may happen rarely, but it’s possible in the reference implementation and permitted by the specification.
We can summarize it as: Methods without Async provides the least control over the thread that will perform the action and may even perform it right in the calling thread, leading to synchronous behavior.
So they are best when the executing thread doesn’t matter and you’re not hoping for background thread execution, i.e. for short, non-blocking operations.
whereas a method terminating with Async always submits the succeeding task to the thread pool, so each of the tasks can be handled by a different thread. In this case, the result of the second CompletableFuture depends on the first, …
When you do
future.thenCompose(quote ->
CompletableFuture.supplyAsync(() -> Discount.applyDiscount(quote), executor))
there are three futures involved, so it’s not exactly clear, which future is meant by “second”. supplyAsync
is submitting an action and returning a future. The submission is contained in a function passed to thenCompose
, which will return another future.
If you used thenComposeAsync
here, you only mandated that the execution of supplyAsync
has to be submitted to the thread pool, instead of performing it directly in the completing thread or “any other caller of a completion method”, e.g. directly in the thread calling thenCompose
.
The reasoning about dependencies makes no sense here. “then” always implies a dependency. If you use thenComposeAsync
here, you enforced the submission of the action to the thread pool, but this submission still won’t happen before the completion of future
. And if future
completed exceptionally, the submission won’t happen at all.
So, is using thenCompose
reasonable here? Yes it is, but not for the reasons given is the quote. As said, using the non-async method implies giving up control over the executing thread and should only be used when the thread doesn’t matter, most notably for short, non-blocking actions. Calling supplyAsync
is a cheap action that will submit the actual action to the thread pool on its own, so it’s ok to perform it in whatever thread is free to do it.
However, it’s an unnecessary complication. You can achieve the same using
future.thenApplyAsync(quote -> Discount.applyDiscount(quote), executor)
which will do exactly the same, submit applyDiscount
to executor
when future
has been completed and produce a new future representing the result. Using a combination of thenCompose
and supplyAsync
is unnecessary here.
Note that this very example has been discussed in this Q&A already, which also addresses the unnecessary segregation of the future operations over multiple Stream
operations as well as the wrong sequence diagram.
What a polite answer from Holger! I am really impressed he could provide such a great explanation and at the same time staying in bounds of not calling the author plain wrong. I want to provide my 0.02$ here too, a little, after reading the same book and having to scratch my head twice.
First of all, there is no "remembering" of which thread executed which stage, neither does the specification make such a statement (as already answered above). The interesting part is even in the cited above documentation:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
Even that ...completes the current CompletableFuture part is tricky. What if there are two threads that try to call complete
on a CompletableFuture
, which thread will run all the dependent actions? The one that has actually completed it? Or any other? I wrote a jcstress test that is very non-intuitive when looking at the results:
@JCStressTest
@State
@Outcome(id = "1, 0", expect = Expect.ACCEPTABLE, desc = "executed in completion thread")
@Outcome(id = "0, 1", expect = Expect.ACCEPTABLE, desc = "executed in the other thread")
@Outcome(id = "0, 0", expect = Expect.FORBIDDEN)
@Outcome(id = "1, 1", expect = Expect.FORBIDDEN)
public class CompletableFutureWhichThread1 {
private final CompletableFuture<String> future = new CompletableFuture<>();
public CompletableFutureWhichThread1() {
future.thenApply(x -> action(Thread.currentThread().getName()));
}
volatile int x = -1; // different default to not mess with the expected result
volatile int y = -1; // different default to not mess with the expected result
volatile int actor1 = 0;
volatile int actor2 = 0;
private String action(String threadName) {
System.out.println(Thread.currentThread().getName());
// same thread that completed future, executed action
if ("actor1".equals(threadName) && actor1 == 1) {
x = 1;
return "action";
}
// same thread that completed future, executed action
if ("actor2".equals(threadName) && actor2 == 1) {
x = 1;
return "action";
}
y = 1;
return "action";
}
@Actor
public void actor1() {
Thread.currentThread().setName("actor1");
boolean completed = future.complete("done-actor1");
if (completed) {
actor1 = 1;
} else {
actor2 = 1;
}
}
@Actor
public void actor2() {
Thread.currentThread().setName("actor2");
boolean completed = future.complete("done-actor2");
if (completed) {
actor2 = 1;
}
}
@Arbiter
public void arbiter(II_Result result) {
if (x == 1) {
result.r1 = 1;
}
if (y == 1) {
result.r2 = 1;
}
}
}
After running this, both 0, 1
and 1, 0
are seen. You do not need to understand very much about the test itself, but it proves a rather interesting point.
You have a CompletableFuture future
that has a future.thenApply(x -> action(...));
attached to it. There are two threads (actor1
and actor2
) that both, at the same time, compete with each other into completing it (the specification says that only one will be successful). The results show that if actor1
called complete
, but does not actually complete the CompletableFuture
(actor2
did), it can still do the actual work in action
. In other words, a thread that completed a CompletableFuture
is not necessarily the thread that executes the dependent actions (those thenApply
for example). This was rather interesting for me to find out, though it makes sense.
Your reasonings about speed are a bit off. When you dispatch your work to a different thread, you usually pay a penalty for that. thenCompose
vs thenComposeAsync
is about being able to predict where exactly is your work going to happen. As you have seen above you can not do that, unless you use the ...Async
methods that take a thread pool. Your natural question should be : "Why do I care where it is executed?".
There is an internal class in jdk's
HttpClient
called SelectorManager
. It has (from a high level) a rather simple task: it reads from a socket and gives "responses" back to the threads that wait for a http result. In essence, this is a thread that wakes up all interested parties that wait for some http packets. Now imagine that this particular thread does internally thenCompose
. Now also imagine that your chain of calls looks like this:
httpClient.sendAsync(() -> ...)
.thenApply(x -> foo())
where foo
is a method that never finishes (or takes a lot of time to finish). Since you have no idea in which thread the actual execution is going to happen, it can, very well, happen in SelectorManager
thread. Which would be a disaster. Everyone other http calls would stale, because this thread is busy now. Thus thenComposeAsync
: let the configured pool do the work/waiting if needed, while the SelectorManager
thread is free to do its work.
So the reasons that the author gives are plain wrong.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With