We have a client server app, 1 server, about 10 clients. They communicate via tcp sockets using custom queries.
The system had been running smooth for many months, but at some point, after the daily scheduled server FULL GC that took about 50s, we figured out that the time between the queries sent by the client and responses received from the server was large, > 10-20s. After some 3 hours the system recovered, everything was running fine again.
While investigating the issue, we found:
We took a thread dump on the server:
java.lang.Thread.State: RUNNABLE
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:293)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The FilterInputStream.read()
is the following:
public int read() throws IOException {
return in.read();
}
The in
in our code is a BufferedInputStream
.
The questions are: Why most connections slowed after the Full GC pause? Why the stacktrace ends in FilterInputStream.read()
? Shouldn't it end somewhere in the BufferedInputStream
or in socket input stream? Can this read lead to high load on server?
The code we use for reading:
int constructLength = _socketDIS.readInt();
ByteArrayOutputStream constructBOAS = new ByteArrayOutputStream(constructLength);
for (int i = 0; i != constructLength; i++)
constructBOAS.write(_socketDIS.read());
constructBOAS.close();
byte[] bytes = constructBOAS.toByteArray();
where:
_socketDIS = new DataInputStream(new BufferedInputStream(_socket.getInputStream()));
Here is the stacktrace from the well working client connections:
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
- locked <0x00007f522cbebca8> (a java.io.BufferedInputStream)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:287)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
UPDATE:
Regarding the EJP answer:
There was no EOS involved, the connections were up, but they were very slow
Even if there was an EOS I can't see how the code could spin at the EOS, the for
is bounded by the constructLength
value. But still, the suggested improvement is valid.
The stacktrace with the problem ends in a read done on a DataInputStream
((_socketDIS.read()
) that is inherited from FilterInputStream.read()
, see code above. DataInputStream
, not BufferedInputStream
is missing the read()
.
Here in FilterInputStream.read()
there is a in.read()
called on a BufferedInputStream
, this one has its own read()
method defined. But the stacktrace stops in the middle, is not reaching BufferedInputStream.read()
. Why?
Full GC is an important event in the garbage collection process. During this full GC phase, garbage is collected from all the regions in the JVM heap (Young, Old, Perm, Metaspace). Full GC tends to evict more objects from memory, as it runs across all generations.
Depending on the time of day, full GC happens as often as every 5 minutes when busy, or up to 30 minutes can go by in between full GCs during the slow periods.
Large Heap size Large heap size (-Xmx) can also cause long GC pauses. If heap size is quite high, then more garbage will be get accumulated in the heap. When Full GC is triggered to evict the all the accumulated garbage in the heap, it will take long time to complete.
Your Netty or Play app should now be able to handle over 1000 concurrent connections (or more, depending on what limits you set above).
Reading one byte at a time is wasteful of CPU. Throw this away:
int constructLength = _socketDIS.readInt();
ByteArrayOutputStream constructBOAS = new ByteArrayOutputStream(constructLength);
for (int i = 0; i != constructLength; i++)
constructBOAS.write(_socketDIS.read());
constructBOAS.close();
byte[] bytes = constructBOAS.toByteArray();
and use this:
int constructLength = _socketDIS.readInt();
byte[] bytes = new byte[constructLength];
_socketDIS.readFully(bytes);
NB _socketDIS
is clearly not a BufferedInputStream
but a DataInputStream,
which is unbuffered.
EDIT
Why the stacktrace ends in FilterInputStream.read()?
Look carefully. BufferedInputStream
doesn't implement all three read() overloads. One of them, I forget which, is implemented down in FilterInputStream, the base class, and the other two overloads call that.
Shouldn't it end somewhere in the BufferedInputStream
No, see above.
or in socket input stream?
Yes, if it was blocking, but it isn't, probably because you're spinning at end of stream, because of your poor code.
Can this read lead to high load on server?
Yes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With