Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Glassfish 4 Grizzly Threads Heavy CPU usage

I have a Jersey application running on Glassfish 4 (4.1 build 13), JDK 1.7 update 67 and AWS Linux AMI and I'm noticing that after some hours running it, CPU usage goes up and stays up even though clients are stopped.

Running "top -H" identifies 2 http-listener-1-kernel threads with high CPU usage (from a total of 16). I then took a thread dump to check these two threads:

"http-listener-1-kernel(3) SelectorRunner" daemon prio=10 tid=0x00007fbc68251000 nid=0xaee runnable [0x00007fbcb55ce000]
    java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x000000060263ad88> (a sun.nio.ch.Util$2)
        - locked <0x000000060263ad78> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000006025fb068> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:106)
        at org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:114)
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:338)
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:278)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)
        at java.lang.Thread.run(Thread.java:745)


"http-listener-1-kernel(8) SelectorRunner" daemon prio=10 tid=0x00007fbc6825b800 nid=0xaf3 runnable [0x00007fbcb50c9000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000006026648b8> (a sun.nio.ch.Util$2)
        - locked <0x00000006026648a8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000602664790> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:106)
        at org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:114)
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:338)
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:278)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)
        at java.lang.Thread.run(Thread.java:745)

Interestingly, the other threads that do not consume CPU are using a different method at sun.nio.ch.SelectorImpl (select instead of selectNow):

"http-listener-1-kernel(7) SelectorRunner" daemon prio=10 tid=0x00007fbc68259800 nid=0xaf2 runnable [0x00007fbcb51ca000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x0000000602665f20> (a sun.nio.ch.Util$2)
        - locked <0x0000000602665f10> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000602665df8> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:112)
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:338)
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:278)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)
        at java.lang.Thread.run(Thread.java:745)

I then turned clients back on and left them running for a few more hours, before stopping them again. Sure enough, "top -H" shows that CPU usage increased again. Another thread dump reveals that one more thread switched to "selectNow" and is now taking more CPU time (along with the other 2):

"http-listener-1-kernel(6) SelectorRunner" daemon prio=10 tid=0x00007fbc68257800 nid=0xaf1 runnable [0x00007fbcb52cb000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x00000006026675c0> (a sun.nio.ch.Util$2)
        - locked <0x00000006026675b0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000602667498> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.selectNow(SelectorImpl.java:106)
        at org.glassfish.grizzly.nio.DefaultSelectorHandler.select(DefaultSelectorHandler.java:114)
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:338)
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:278)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:565)
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:545)
        at java.lang.Thread.run(Thread.java:745)

It appears that something causes the threads to call the "selectNow" method of sun.nio.ch.SelectorImpl and once they enter there, CPU usage increases heavily and does not decrease unless server is restarted.

Is this a known issue? Can this be somehow caused by my code?

Thanks for the help!

like image 261
user3112808 Avatar asked Nov 23 '22 12:11

user3112808


1 Answers

Not sure if it's still relevant after such a long time, but this could also be similar to the issue described here:

We recently experienced a recurring issue with some of our JVMs that use Grizzly. Occasionally one thread in a JVM would spike to 100% CPU usage and stay there until we killed and restarted it.

Grizzly added a config option to activate a workaround for that issue in GRIZZLY-1801 to

"Introduce system property to force NIO selector spin detection"

Try setting the JVM option

-Dorg.glassfish.grizzly.nio.DefaultSelectorHandler.force-selector-spin-detection=true

to activate the workaround.

I could not exactly pin down from which Grizzly version(s) on this option is available, but if you have Grizzly 2.2.23+ it should work.

like image 80
Kosi2801 Avatar answered Dec 15 '22 01:12

Kosi2801