Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does park/unpark have 60% CPU usage?

Recently we started stress testing our application (an XMPP based chat server) using YJP 11.0.9. During our test we noticed following strange behavior.

  1. Sampling shows sun.misc.Unsafe.unpark(Object) took 60% of CPU.
  2. For the same app Tracing shows that LockSupport.park(Object) took 52% of CPU.

I did multiple tests to confirm results and every time I got similar results.

I am unable to understand why unpark should take 60% time and why tracing shows exactly opposite results.

Can someone help me understand these results. Am I missing something here?

Environment:

java -version
java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
like image 289
Kalpesh Patel Avatar asked Oct 19 '12 11:10

Kalpesh Patel


3 Answers

High CPU time of Unsafe.unpark is a sign of excessive Context Switching. Here is an article to get the idea of how expensive is a context switch:

How long does it take to make a context switch?

The easiest option to find out CS count on Linux is run vmstat <seconds>.

Once high CS is confirmed (e.g. more 10K switches per core/per second) you take the offending thread (you can sort threads in YJP by their CPU time) and run strace -p <pid> -c to find out the cause of switching, e.g. thread is reading from socket in small pieces and get's switched off, in which case increasing socket buffer might help.

like image 95
Ivan Balashov Avatar answered Nov 13 '22 19:11

Ivan Balashov


With certain low level blocking commands like read/write/park/lock the "CPU" time is over estimated as it assumes it is consuming CPU when actually the operation is blocking. The fact unpark/park are both high does suggest you have a problem, but I suspect you should take the lower of the two percentages as an estimate.

like image 3
Peter Lawrey Avatar answered Nov 13 '22 21:11

Peter Lawrey


I learned the hard way that you can also have high CPU usage % on LockSupport.parkNanos() if you call multiple times the method and your sleep argument is in the order of nanoseconds:

while(true){
  // code logic
  LockSupport.parkNanos(100);
}

Simply changing to microseconds helped me resolve my issue. I also tried it on my local machine and was able to reproduce the high CPU usage %, and saw it drastically improve after going for microseconds.

like image 1
Chiquelo Avatar answered Nov 13 '22 20:11

Chiquelo