
G1 GC - Large background I/O makes the JVM unresponsive - an 8 sec pause

We have a Java application that needs near-real-time response, but we are seeing pauses of up to 8 sec.

Special running conditions:

  1. At intervals, the application serializes a huge data snapshot, up to 1.5 GB in size, to disk (SSD).
  2. Occasionally, another process running on the same machine performs heavy background I/O.

What we are observing: if a GC happens to kick in DURING the interval when this large serialization write is in progress, it causes a huge 6 to 8 sec pause, during which the JVM is completely unresponsive/frozen.

The JFR recording shows that:

  1. ALL the threads are in parked/wait/sleep states, except the one that is writing to the disk.
  2. The G1 GC takes about 200 ms to finish the collection.

From the GC Log: 2018-09-05T18:23:40.277+0000: 39892.345: Total time for which application threads were stopped: 8.3785112 seconds, Stopping threads took: 8.3765855 seconds

Java Version: java version "1.8.0_181" Java(TM) SE Runtime Environment (build 1.8.0_181-b25) Java HotSpot(TM) 64-Bit Server VM (build 25.181-b25, mixed mode)

Questions:

  1. Why is there such an 8 sec non-GC JVM pause when GC and background I/O happen at the same time?
  2. How can we overcome this large pause?

JFR file: https://www.filehosting.org/file/details/756217/Run.jfr

Dhiman Ghosh asked Sep 12 '18 12:09


1 Answer

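Your problem is not in the GC but in the JVM's stop-the-world mechanics: safepoints. The JVM waits for all threads to park at a safepoint before it starts any GC-related work. That matches your GC log, where "Stopping threads took" accounts for almost all of the 8.38 seconds the application threads were stopped.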
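To see how the pause splits between waiting for the safepoint and the work done inside it, the log line quoted in the question can be parsed directly. This is a minimal sketch: the class name `StoppedTimeLine` and the regular expression are my own, written against the line format that `-XX:+PrintGCApplicationStoppedTime` prints on Java 8.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoppedTimeLine {
    // Assumed line format, copied from the GC log line in the question.
    private static final Pattern STOPPED = Pattern.compile(
            "Total time for which application threads were stopped: ([0-9.]+) seconds, "
            + "Stopping threads took: ([0-9.]+) seconds");

    /** Returns {totalSeconds, timeToSafepointSeconds}, or null if the line does not match. */
    static double[] parse(String line) {
        Matcher m = STOPPED.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        String line = "2018-09-05T18:23:40.277+0000: 39892.345: Total time for which "
                + "application threads were stopped: 8.3785112 seconds, "
                + "Stopping threads took: 8.3765855 seconds";
        double[] t = parse(line);
        // The difference is the time spent inside the safepoint operation itself.
        System.out.printf("time to safepoint: %.4f s, inside safepoint: %.4f s%n",
                t[1], t[0] - t[1]);
    }
}
```

For this particular line, the time spent inside the safepoint is only about 0.002 s; everything else is the JVM waiting for the last thread to arrive.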

If your Java code uses memory-mapped files, you may need to replace them with regular I/O. Memory mapping (especially when write-heavy) plays badly with safepoint mechanics. A thread accessing a memory-mapped file may be blocked by the OS while it waits for a memory page to be read or written. If a GC is triggered at such a moment, it has to wait for the I/O-blocked thread to resume and reach its next safepoint check.

UPDATE: To put it simply, if a Java thread is blocked in a RandomAccessFile/FileChannel read method, it doesn't prevent the JVM from reaching a safepoint. But if a Java thread is blocked on a read/write operation over a memory-mapped file, the JVM cannot enter the safepoint until that thread is unblocked. If such access is wrapped in a loop, the JVM may, under certain conditions, even have to wait until the loop has finished.
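As an illustration of the "regular IO" alternative, here is a minimal sketch that writes a snapshot through a FileChannel instead of a MappedByteBuffer. The class name `SnapshotWriter`, the method name and the temp-file setup are hypothetical and only for illustration; your real serialization code will look different. The point is that a thread blocked inside `FileChannel.write` is in native code, which the JVM already treats as safe for a safepoint.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SnapshotWriter {
    /** Writes the whole snapshot with explicit channel writes (no memory mapping). */
    static void writeWithChannel(Path target, byte[] snapshot) throws IOException {
        try (FileChannel ch = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(snapshot);
            // write() may write fewer bytes than requested, so loop until drained.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("snapshot", ".bin"); // hypothetical location
        byte[] snapshot = new byte[1 << 20];                    // stand-in for real data
        writeWithChannel(target, snapshot);
        System.out.println("wrote " + Files.size(target) + " bytes to " + target);
        Files.delete(target);
    }
}
```

For a 1.5 GB snapshot you would probably stream the data in chunks rather than hold it in one array, but the blocking behaviour with respect to safepoints is the same.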

Another possible cause of a long "Stopping threads took" time is loops. If you have a simple loop with an int counter, the JVM considers it "fast" and may omit safepoint checks within the loop body.
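A commonly described workaround, sketched below, is to use a long counter in hot loops that may run for a long time. As far as I know, HotSpot's JIT on Java 8 only treats int-indexed loops as "counted" loops, so a long-indexed loop keeps its safepoint check on every iteration. The class and method names are made up for the example, and the exact behaviour depends on the JVM version and on what the JIT decides to do.

```java
public class LoopSafepoints {
    // int counter: the JIT may compile this as a counted loop
    // and drop the safepoint check from the loop body.
    static long sumCounted(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // long counter: same result, but the loop is not treated as a counted loop,
    // so the thread can stop at a safepoint between iterations.
    static long sumWithSafepointChecks(int[] data) {
        long sum = 0;
        for (long i = 0; i < data.length; i++) {
            sum += data[(int) i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sumCounted(data) == sumWithSafepointChecks(data));
    }
}
```

The long-indexed version can be slightly slower, so it is worth applying only to loops that can actually run long enough to delay a safepoint.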

Alexey Ragozin answered Nov 01 '22 22:11