
Why is there performance degradation after ~6 hours of running Java 9 with G1, without any actual increase in load?

I switched one instance (2 vCPU, 2 GB RAM, load ~4k req/sec) from the latest Java 8 to Java 9. For a while everything was fine and CPU usage was the same as before. However, after ~6 hours CPU consumption increased by 4 percentage points (from 21% to 25%) for no apparent reason. There were no traffic spikes, no increase in memory consumption, and no changes in any metric (I have counters for every method in the code). Nothing.

I left this instance untouched for ~12 hours, expecting it to revert. But nothing changed; it just kept consuming more CPU.

The top command showed that the Java server process had more CPU spikes than usual. I had recently read that G1 is not well suited to high-throughput workloads, so I concluded that G1 could be the cause.
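One way to check that suspicion from inside the process is to read the collector statistics the JVM exposes through the standard `java.lang.management` API. The following is only a minimal sketch: the class name `GcTimeReport` and its helper methods are hypothetical, and the reported time covers stop-the-world collection time as accounted by the JVM, so CPU spent by G1's concurrent threads outside pauses would not show up here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not part of the original server): reports how much
// time the JVM has spent in garbage collection since it started.
public class GcTimeReport {

    /** Sums the accumulated collection time in ms over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means "undefined" for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Collection time as a percentage of JVM uptime. */
    public static double gcTimePercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        // One line per collector, then the overall share of uptime.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("GC time: %.2f%% of uptime%n", gcTimePercent());
    }
}
```

Logging these numbers periodically before and after the CPU increase would show whether collection time grew along with it. If it did not, the extra CPU may come from concurrent GC work or from something unrelated to GC.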

I restarted the instance with:

java -XX:+UseParallelGC -jar server-0.28.0.jar 

After ~20 hours of monitoring, everything is fine again. CPU consumption is back at 21%, the level it held for many days before.

[Image: CPU usage right after Java 9 deployment (6h scale)]

[Image: CPU increase after 7 hours + 12 hours "untouched" (7d scale)]

[Image: CPU after -XX:+UseParallelGC (24h scale)]

So my question is: is this expected behavior for G1? Has anyone else seen something similar?

Ubuntu 16.04 x64

java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)

EDIT 03.01.2019

I tried running the same server with G1 on Java 10.0.2:

java version "10.0.2" 2018-07-17
Java(TM) SE Runtime Environment 18.3 (build 10.0.2+13)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)

G1 consumes 40% more CPU than UseParallelGC right after the server restart.

Dmitriy Dumanskiy asked Oct 05 '17 09:10


1 Answer

(Note that GC tuning is extremely dependent on the environment, so there is no magic recipe.)

I had a very similar issue with G1. By default, it seems to be rather badly suited to REST endpoints (again, this is only what I experienced in my own environment). What helped me was experimenting with the GC flags, as described here.

For us, the biggest improvements came from -XX:G1NewSizePercent=25 and -XX:MaxGCPauseMillis=50. G1 also tunes itself over time, so the maximum GC pause target has a significant effect on all other parameters.
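When experimenting with these flags, it can help to confirm which values the running JVM actually picked up. The sketch below assumes a HotSpot JVM and uses `com.sun.management.HotSpotDiagnosticMXBean`; the class name `GcFlagCheck` is hypothetical. Note that on HotSpot `G1NewSizePercent` is an experimental option, so it also needs `-XX:+UnlockExperimentalVMOptions` on the command line, and it may be reported as unavailable otherwise.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper: prints the effective value of a few GC-related flags.
// Example launch (flags taken from the answer above, jar name from the question):
//   java -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions \
//        -XX:G1NewSizePercent=25 -XX:MaxGCPauseMillis=50 -jar server-0.28.0.jar
public class GcFlagCheck {

    /** Returns "name=value (origin: ...)" or a note that the flag is unavailable. */
    public static String describe(String flagName) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            // Not a HotSpot-compatible JVM.
            return flagName + " is not available on this JVM";
        }
        try {
            VMOption option = bean.getVMOption(flagName);
            return flagName + "=" + option.getValue()
                    + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the JVM does not expose an option with this name.
            return flagName + " is not available on this JVM";
        }
    }

    public static void main(String[] args) {
        String[] flags = {"UseG1GC", "UseParallelGC", "MaxGCPauseMillis", "G1NewSizePercent"};
        for (String flag : flags) {
            System.out.println(describe(flag));
        }
    }
}
```

The origin shows whether a value came from the command line or is a default, which makes it easier to tell a flag that was silently ignored from one that took effect.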

Agoston Horvath answered Sep 22 '22 20:09