I work on a Jetty web app that was running on Java 16. I tried to upgrade it to Java 17 but there were critical performance issues caused entirely by one call to parallelStream()
.
The only changes are the Java version bump from 16 to 17, --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED
and the runtime bump from openjdk:16.0.1-jdk-oraclelinux8
to openjdk:17.0.1-jdk-oraclelinux8
.
We managed to obtain a thread dump and it contains many of these:
"qtp1368594774-200" #200 prio=5 os_prio=0 cpu=475.94ms elapsed=7189.65s tid=0x00007fd49c50cc10 nid=0xd1 waiting on condition [0x00007fd48fef7000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
- parking to wait for <0x00000007b73439a8> (a java.util.stream.ReduceOps$ReduceTask)
at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:341)
at java.util.concurrent.ForkJoinTask.awaitDone([email protected]/ForkJoinTask.java:468)
at java.util.concurrent.ForkJoinTask.invoke([email protected]/ForkJoinTask.java:687)
at java.util.stream.ReduceOps$ReduceOp.evaluateParallel([email protected]/ReduceOps.java:927)
at java.util.stream.AbstractPipeline.evaluate([email protected]/AbstractPipeline.java:233)
at java.util.stream.ReferencePipeline.collect([email protected]/ReferencePipeline.java:682)
at com.stackoverflowexample.aMethodThatDoesBlockingIOUsingParallelStream()
The code that is causing the issue is something like:
list.parallelStream()
.map(this::callRestServiceToGetSomeData)
.collect(Collectors.toUnmodifiableList());
This image shows thread use before upgrading from jdk16 (LHS), upgrading to jdk17 (the huge spike in the middle), then removing the call to parallelStream()
still on jdk17 (RHS):
What change in Java 17 (openjdk-17.0.1_linux-x64_bin.tar.gz) has caused this?
We all know or be told that creating a new thread is a heavy operation. But it seems okay to me after ran a few tests. For example: Here is the memory usage by run below simple test with 10_000 thread. It took about 2 or 3 seconds on my laptop and jvm usage is about 1.5 G.
final int threadNum = 10_000;
final Callable<String> task = () -> {
String bigString = UUID.randomUUID().toString().repeat(1000);
assertTrue(bigString.chars().sum() > 0);
Thread.currentThread().sleep(1000);
return bigString;
};
final ExecutorService executorService = Executors.newFixedThreadPool(threadNum);
final List<Future<String>> futures = new ArrayList<>(threadNum);
for (int i = 0; i < threadNum; i++) {
futures.add(executorService.submit(task));
}
long ret = futures.stream().map(Fn.futureGet()).mapToInt(String::length).sum();
System.out.println(ret);
assertEquals(UUID.randomUUID().toString().length() * threadNum * 1000, ret);
I think it's a rare chance that 10_000 will be created/used in most of applications. If I changed the thread number to 1000. Again it took 2 or 3 seconds and memory usage is about: 300 MB.
Is it possible or a good idea to use stream api to run blocking I/O call in parallel? I think so. Here is a sample with my tool: abacus-common
// Run above task by Stream.
ret = IntStreamEx.range(0, threadNum)
.parallel(threadNum)
.mapToObj(it -> Try.call(task))
.sequential()
.mapToInt(String::length)
.sum();
// Or other task
StreamEx.of(list)
.parallel(64) // Specify the concurrent thread number. It could be from 1 up to thousands.
.map(this::callRestServiceToGetSomeData)
.collect(Collectors.toUnmodifiableList());
Or use Virtual Threads introduced in Java 19+
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
StreamEx.of(list)
.parallel(executor)
.map(this::callRestServiceToGetSomeData)
.collect(Collectors.toUnmodifiableList());
}
I know this is not a direct answer to the question. But it may resolve the original problem which brought up this question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With