Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What caused socket connections to be slow after Full GC?

We have a client server app, 1 server, about 10 clients. They communicate via tcp sockets using custom queries.

The system had been running smooth for many months, but at some point, after the daily scheduled server FULL GC that took about 50s, we figured out that the time between the queries sent by the client and responses received from the server was large, > 10-20s. After some 3 hours the system recovered, everything was running fine again.

While investigating the issue, we found:

  1. No garbage collection problems on both clients and server
  2. Query processing time on server was small.
  3. Load on server was high.
  4. The network bandwidth was not saturated.
  5. Connections were not reset during the FULL GC (the daily FULL GC was a normal event until then)
  6. The machine and OS changed recently from Centos 6 (kernel 2.6.32) to Centos 7 (kernel 3.10.0), but the new config was extensivelly tested. Also Oracle JDK version changed from 1.7.65 to 1.7.75.

We took a thread dump on the server:

java.lang.Thread.State: RUNNABLE
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:293)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The FilterInputStream.read() is the following:

    public int read() throws IOException {
    return in.read();
}

The in in our code is a BufferedInputStream.

The questions are: Why most connections slowed after the Full GC pause? Why the stacktrace ends in FilterInputStream.read()? Shouldn't it end somewhere in the BufferedInputStream or in socket input stream? Can this read lead to high load on server?

The code we use for reading:

int constructLength = _socketDIS.readInt();
ByteArrayOutputStream constructBOAS = new ByteArrayOutputStream(constructLength);
for (int i = 0; i != constructLength; i++)
      constructBOAS.write(_socketDIS.read());
constructBOAS.close();
byte[] bytes = constructBOAS.toByteArray();

where:

_socketDIS = new DataInputStream(new BufferedInputStream(_socket.getInputStream()));

Here is the stacktrace from the well working client connections:

java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    - locked <0x00007f522cbebca8> (a java.io.BufferedInputStream)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:287)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

UPDATE:

Regarding the EJP answer:

  1. There was no EOS involved, the connections were up, but they were very slow

  2. Even if there was an EOS I can't see how the code could spin at the EOS, the for is bounded by the constructLength value. But still, the suggested improvement is valid.

  3. The stacktrace with the problem ends in a read done on a DataInputStream ((_socketDIS.read()) that is inherited from FilterInputStream.read(), see code above. DataInputStream, not BufferedInputStream is missing the read(). Here in FilterInputStream.read() there is a in.read() called on a BufferedInputStream, this one has its own read() method defined. But the stacktrace stops in the middle, is not reaching BufferedInputStream.read(). Why?

like image 554
dcernahoschi Avatar asked May 20 '15 10:05

dcernahoschi


People also ask

What happens full GC?

Full GC is an important event in the garbage collection process. During this full GC phase, garbage is collected from all the regions in the JVM heap (Young, Old, Perm, Metaspace). Full GC tends to evict more objects from memory, as it runs across all generations.

How long should a full GC take?

Depending on the time of day, full GC happens as often as every 5 minutes when busy, or up to 30 minutes can go by in between full GCs during the slow periods.

Why does Major GC take so long?

Large Heap size Large heap size (-Xmx) can also cause long GC pauses. If heap size is quite high, then more garbage will be get accumulated in the heap. When Full GC is triggered to evict the all the accumulated garbage in the heap, it will take long time to complete.

How many connections can Netty handle?

Your Netty or Play app should now be able to handle over 1000 concurrent connections (or more, depending on what limits you set above).


1 Answers

Reading one byte at a time is wasteful of CPU. Throw this away:

int constructLength = _socketDIS.readInt();
ByteArrayOutputStream constructBOAS = new ByteArrayOutputStream(constructLength);
for (int i = 0; i != constructLength; i++)
      constructBOAS.write(_socketDIS.read());
constructBOAS.close();
byte[] bytes = constructBOAS.toByteArray();

and use this:

int constructLength = _socketDIS.readInt();
byte[] bytes = new byte[constructLength];
_socketDIS.readFully(bytes);

NB _socketDIS is clearly not a BufferedInputStream but a DataInputStream, which is unbuffered.

EDIT

Why the stacktrace ends in FilterInputStream.read()?

Look carefully. BufferedInputStream doesn't implement all three read() overloads. One of them, I forget which, is implemented down in FilterInputStream, the base class, and the other two overloads call that.

Shouldn't it end somewhere in the BufferedInputStream

No, see above.

or in socket input stream?

Yes, if it was blocking, but it isn't, probably because you're spinning at end of stream, because of your poor code.

Can this read lead to high load on server?

Yes.

like image 74
user207421 Avatar answered Dec 10 '22 09:12

user207421