We are seeing this intermittent issue in production. The CPU gets pegged at 50% (on a 2-core machine, so one core is fully busy) at random and never recovers. The only option is to restart the server. This is how the CPU looks in Dynatrace:
This is how the thread dump looks when analyzed through Dynatrace:
Through my research, it appears there was a JDK defect:
Calling 'java.util.zip.Deflater.finish()' prematurely hangs the application.
The application is spinning consuming one cpu
https://bugs.openjdk.java.net/browse/JDK-8060193
It only happens randomly, when for some reason multiple filters are involved.
I was able to reproduce this using the test class from the JIRA ticket above on a CentOS VM running JDK "1.8.0_201". That was surprising because, according to the docs and the ticket, this has been fixed.
On further research, I found a similar defect that was opened again against the JDK:
https://bugs.openjdk.java.net/browse/JDK-8193682
Now the JDK team is not willing to work on it unless someone can reproduce it. Since it happens randomly in production, I am not sure how to reproduce it. The test class from https://bugs.openjdk.java.net/browse/JDK-8060193 still shows the issue. Is this even a valid test case? If it is valid, then there would be problems every time we send compressed data.
Any pointers as to why this is happening and how we can solve it?
Update: One of the libraries we are using was throwing the exception "Malformed UTF-8 character (unexpected non-continuation byte 0x00, immediately after start byte 0xfd)" for this value:
LastName, First’Name As we can see, this is not a regular apostrophe. It can get in by copy-pasting from Word, which auto-corrects a regular apostrophe to this typographic character.
Our reproducer did throw an error, but the CPU was not getting stuck. I think it only happens under high volume and traffic.
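As a rough illustration of how such input could be caught before it reaches the library, here is a minimal sketch that strictly validates bytes as UTF-8 using the JDK's CharsetDecoder. The class name Utf8Check and the method isValidUtf8 are hypothetical names made up for this example. It says nothing about why the CPU gets stuck; it only shows how malformed bytes can be detected.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library mentioned in the question.
final class Utf8Check {

    /** Returns true only if the bytes decode as well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently
                    // substituting U+FFFD for bad sequences.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The typographic apostrophe (U+2019) is fine when it really is UTF-8.
        byte[] good = "LastName, First\u2019Name".getBytes(StandardCharsets.UTF_8);
        // The byte pair from the error message: 0xfd followed by 0x00.
        byte[] bad = {(byte) 0xfd, 0x00};
        System.out.println(isValidUtf8(good));
        System.out.println(isValidUtf8(bad));
    }
}
```

A check like this could sit at the point where request data enters the application, so that bad input is rejected with a clear error instead of failing deep inside a filter chain.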
As I said in a comment before, we are facing this problem when we try to generate Zip files that are written to the OutputStream of the HttpServletResponse through a ZipOutputStream.
The cores run at 100% because of three loops that become infinite under certain conditions: one in ZipOutputStream (closeEntry()) and two in DeflaterOutputStream (write() and finish()).
These infinite loops look like this:
    while (!def.finished()) {
        deflate();
    }
Where def is a java.util.zip.Deflater.
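To make the failure mode concrete, here is a minimal sketch that runs the same kind of loop directly against a Deflater, with a guard that gives up after a number of consecutive calls that produce no output. The class DeflateLoopDemo, its methods, and the maxStalls parameter are hypothetical names for this example. The stalled case is provoked deliberately by never calling finish(); I am not claiming this is the condition that triggers the hang in production, only that it shows how finished() can stay false forever.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical demo class; it only uses the public java.util.zip.Deflater API.
final class DeflateLoopDemo {

    /**
     * Same shape as the JDK loop, but throws after maxStalls consecutive
     * deflate() calls that return zero bytes instead of spinning forever.
     */
    static byte[] drain(Deflater def, int maxStalls) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        int stalls = 0;
        while (!def.finished()) {
            int n = def.deflate(buf, 0, buf.length);
            if (n > 0) {
                out.write(buf, 0, n);
                stalls = 0; // progress was made, reset the guard
            } else if (++stalls >= maxStalls) {
                throw new IllegalStateException(
                        "deflater made no progress in " + stalls + " calls");
            }
        }
        return out.toByteArray();
    }

    /** Normal case: finish() was called, so finished() eventually becomes true. */
    static byte[] compress(byte[] data) {
        Deflater def = new Deflater();
        try {
            def.setInput(data);
            def.finish();
            return drain(def, 1000);
        } finally {
            def.end(); // release native memory
        }
    }

    /** Stalled case: no input and no finish(), so finished() stays false. */
    static boolean stallsWithoutFinish() {
        Deflater def = new Deflater();
        try {
            drain(def, 1000);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        } finally {
            def.end();
        }
    }

    public static void main(String[] args) {
        System.out.println(compress("hello hello hello".getBytes()).length > 0);
        System.out.println(stallsWithoutFinish());
    }
}
```

Without the guard, the second case would occupy one core indefinitely, which matches the symptom of a thread stuck inside deflate().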
If I understand correctly, this is the problem described in JDK-8193682. There is a workaround class there which overrides the deflate method of ZipOutputStream.
I am going to try a class based on that workaround, which accepts a timeout that is checked in the deflate method. I hope this approach does not produce resource leaks.
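A minimal sketch of what such a class might look like follows. It is not the workaround class from the ticket; the name StallGuardZipOutputStream, the stallLimitMillis parameter, and the roundTrip helper are assumptions made up for this example. It relies only on the protected fields def, buf and out inherited from DeflaterOutputStream, and it measures time without output rather than total time, so a slow client blocking in out.write should not trip it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical class: aborts when deflate() keeps returning no output for too long.
class StallGuardZipOutputStream extends ZipOutputStream {
    private final long stallLimitNanos;
    private long lastProgress = System.nanoTime();

    StallGuardZipOutputStream(OutputStream out, long stallLimitMillis) {
        super(out);
        this.stallLimitNanos = TimeUnit.MILLISECONDS.toNanos(stallLimitMillis);
    }

    // Restart the stall clock at the start of every operation that loops on deflate().
    @Override
    public synchronized void write(byte[] b, int off, int len) throws IOException {
        lastProgress = System.nanoTime();
        super.write(b, off, len);
    }

    @Override
    public void closeEntry() throws IOException {
        lastProgress = System.nanoTime();
        super.closeEntry();
    }

    @Override
    public void finish() throws IOException {
        lastProgress = System.nanoTime();
        super.finish();
    }

    // Same work as the inherited deflate(), plus the stall check.
    @Override
    protected void deflate() throws IOException {
        int len = def.deflate(buf, 0, buf.length);
        if (len > 0) {
            out.write(buf, 0, len);
            lastProgress = System.nanoTime();
        } else if (System.nanoTime() - lastProgress > stallLimitNanos) {
            throw new IOException("Deflater produced no output for too long, aborting");
        }
    }

    /** Usage example: zip one entry in memory and read it back. */
    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (StallGuardZipOutputStream zip = new StallGuardZipOutputStream(sink, 5000)) {
                zip.putNextEntry(new ZipEntry("a.txt"));
                zip.write(text.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
            try (ZipInputStream in = new ZipInputStream(
                    new ByteArrayInputStream(sink.toByteArray()))) {
                in.getNextEntry();
                ByteArrayOutputStream back = new ByteArrayOutputStream();
                byte[] chunk = new byte[256];
                for (int n; (n = in.read(chunk)) > 0; ) {
                    back.write(chunk, 0, n);
                }
                return new String(back.toByteArray(), StandardCharsets.UTF_8);
            } 
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On the resource leak concern: if the guard fires, close() will call finish() again and may hit the same stall, so the caller would probably need to catch the exception and release the Deflater itself (the protected def field is reachable from the subclass). I have not verified how this behaves across JDK versions.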
Related question: Thread locking when flushing jsp file
I want to post an update on this problem, which has bugged us for years. We had an initiative underway to migrate static content to a CDN. After the CDN was implemented and all static resources were served from a different server, the ZipStream problem went away. The research had suggested that the problem was with dynamic content rather than static content, so I am not sure how this solved it. Maybe someone reading this answer can explain to me why this fixed it.