
Java 8 sequential streams causing very high CPU usage

In my spring boot service, I am validating incoming orders based upon order details and customer details.

In customer details, I have different lists of objects like Services, Attributes, Products, etc. and for every list, I am doing something like below:

products.stream()
        .filter(Objects::nonNull)
        .map(Product::getResource)
        .filter(Objects::nonNull)
        .filter(<SimplePredicate>)
        .collect(Collectors.toList());

I am using streams like this many times per request, for products, services and attributes. Performance-wise the service achieves very high TPS and memory usage is quite low, but it consumes a lot of CPU. We run the service in Kubernetes pods and it uses about 90% of the CPU allotted to it.

Another interesting observation: the more CPU we allocate, the higher the TPS we achieve, and CPU usage again climbs to about 90%.

Is it because streams consume more CPU? Or is it because of heavy garbage collection, since the temporary objects created by each stream pipeline have to be garbage collected afterwards?

EDIT-1:

Upon further investigation using load testing, we observed the following:

  • Whenever we increase the number of concurrent threads, the high CPU usage causes the service to stop responding, followed by a sudden drop in CPU usage and consequently low TPS.
  • Whenever we decrease the number of concurrent threads, CPU usage remains high but the service performs optimally, i.e. achieves high TPS.

The following are the TPS vs. CPU usage statistics under different CPU-limit and thread-count configurations.

CPU: 1500m, Threads:70

| TPS | 176  | 140 | 125 | 79 | 63 |
|----------------------------------|
| CPU | 1052 | 405 | 201 | 84 | 13 |  

CPU: 1500m, Threads:35

| TPS | 500 | 510 | 500 | 530 |
|-----------------------------|
| CPU | 1172| 1349| 1310| 1214|  

CPU: 2500m, Threads:70

| TPS |  20 |  20 |  25 |  28 | 26 |
|----------------------------------|
| CPU | 2063| 2429| 2303| 879 | 35 |  

CPU: 2500m, Threads:35

| TPS | 1193 | 1200 | 1200 | 1230 |
|---------------------------------|
| CPU | 600  | 1908 | 2044 | 1949 | 

Tomcat Configuration Used:

server.tomcat.max-connections=100
server.tomcat.max-threads=100
server.tomcat.min-spare-threads=5

EDIT-2:
Thread dump analysis shows that about 80% of the http-nio threads are in the "waiting on condition" state. That means most of the threads are waiting for something rather than consuming CPU, which explains the low CPU usage. But what could be causing the threads to wait? I am not making any asynchronous calls in the service, and I am not using parallel streams either, only sequential streams as shown above.

The following is the Thread dump when CPU and TPS go down:

"http-nio-8090-exec-72" #125 daemon prio=5 os_prio=0 tid=0x00007f014001e800 nid=0x8f waiting on condition [0x00007f0158ae1000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000000d7470b10> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:89)
    at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:33)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - None

asked Oct 21 '20 by mukund

1 Answer

Is it because streams consume more CPU? Or is it because of heavy garbage collection, since the temporary objects created by each stream pipeline have to be garbage collected afterwards?

Clearly streams do consume CPU. And generally speaking, code implemented using non-parallel streams does run a bit slower than code implemented using old-fashioned loops. However, the difference in performance is not huge. (Maybe 5 or 10%?)

In general, a stream does not generate more garbage than an old-fashioned loop performing the same computation. For instance, if we compared your example with a loop doing the same thing (i.e. generating a new list), I would expect a near 1-to-1 correspondence between the memory allocations of the two versions.
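
For illustration only, here is a rough loop-based equivalent of the pipeline from the question (the Resource type and the simplePredicate name are assumptions, standing in for whatever Product::getResource returns and for the <SimplePredicate> placeholder); the per-element work and allocations are essentially the same in both versions:

List<Resource> result = new ArrayList<>();
for (Product product : products) {
    if (product == null) {
        continue;                                  // .filter(Objects::nonNull)
    }
    Resource resource = product.getResource();     // .map(Product::getResource)
    if (resource == null) {
        continue;                                  // .filter(Objects::nonNull)
    }
    if (simplePredicate.test(resource)) {          // .filter(<SimplePredicate>)
        result.add(resource);                      // .collect(Collectors.toList())
    }
}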

In short, I don't think streams are directly implicated in this. Obviously, if your service processes a lot of lists (whether with streams or with loops) for each request, that is going to affect the TPS, and even more so if the lists are actually fetched from your backend database. But that is normal too. It could be addressed by things like request caching, or by tweaking the granularity of the API so that requests don't compute expensive results the caller doesn't actually need.
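
As a sketch of the request-caching idea only (assuming Spring's caching abstraction is enabled with @EnableCaching; CustomerService, CustomerDetails and CustomerRepository are made-up names, not taken from the question):

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class CustomerService {

    private final CustomerRepository customerRepository;   // hypothetical backend accessor

    public CustomerService(CustomerRepository customerRepository) {
        this.customerRepository = customerRepository;
    }

    // Cache customer details per customer id, so repeated order validations
    // for the same customer don't refetch and rebuild the lists every time.
    @Cacheable("customerDetails")
    public CustomerDetails getCustomerDetails(String customerId) {
        return customerRepository.loadDetails(customerId);
    }
}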

(I would NOT recommend adding parallel() to your streams in this scenario. Since your service is already compute (or swap) bound, there are no "spare" cycles to run the streams in parallel. Using parallel() here is likely to reduce your TPS.)

The second part of your question is about performance (TPS) versus thread count versus (I assume) vCPUs. It is not possible to interpret the results you have given, because you don't explain the units of measurement and because I suspect there are other factors in play.

However, as a general rule:

  • Adding more threads when an application is compute-intensive doesn't help.
  • More threads means more memory utilization (thread stacks plus objects reachable only from thread stacks); see the sketch after this list.
  • More memory utilization means the GC has to work harder and runs less efficiently.
  • If your JVM is using more virtual memory than you have physical memory, the OS will typically have to swap pages between RAM and disk. That hurts performance, especially during garbage collection.
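
A small standalone sketch of how you could watch those numbers from inside the JVM, using the standard java.lang.management beans (in a real service you would more likely expose the same values through a metrics or actuator endpoint):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class JvmStats {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Live threads:  " + threads.getThreadCount());
        System.out.println("Peak threads:  " + threads.getPeakThreadCount());
        System.out.println("Heap used MB:  "
                + memory.getHeapMemoryUsage().getUsed() / (1024 * 1024));
        System.out.println("Non-heap MB:   "
                + memory.getNonHeapMemoryUsage().getUsed() / (1024 * 1024));
    }
}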

It is also possible that there are effects attributable to your cloud platform. For example, if you are running in a virtual server on a compute node with lots of other virtual servers, you may not get a full CPU's worth of processing per vCPU. And if your virtual server generates a lot of swap traffic, that will most likely reduce your server's share of CPU resources even further.

We cannot say what is actually causing your problem, but if I were in your shoes I would be looking at the Java GC logs, and using OS tools like vmstat and iostat to check for signs of excessive paging and excessive I/O in general.
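
For example, something along these lines (assuming HotSpot on Java 8, which the thread dump suggests; on Java 9+ the unified -Xlog:gc* option replaces these GC-logging flags):

# JVM options to write a detailed GC log
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/gc.log

# on the node / in the container: watch paging and I/O, sampling every 5 seconds
vmstat 5
iostat -x 5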

answered Nov 15 '22 by Stephen C