
CPU gets pegged - Problem with java.util.zip.ZStreamRef

Tags: java, cpu

We are seeing this intermittent issue in production. CPU usage gets pegged at 50% on a 2-core machine (one core fully busy) at random times and never comes back down. The only option is to restart the server. This is how the CPU usage appears in Dynatrace:

[image]

This is how the thread dump looks when analyzed through Dynatrace:

[image]

Apache tomcat thread stuck
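Where no APM tool is at hand, a rough in-process check can point at the spinning thread. The following is only a minimal sketch using the standard `java.lang.management.ThreadMXBean`; the class name `BusyThreads` and the method `top` are hypothetical and do not come from the question.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the threads that have consumed the most CPU time.
final class BusyThreads {

    /** Returns up to `limit` lines of "cpuMillis ms  threadName  topFrame", busiest first. */
    static List<String> top(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return lines; // CPU time measurement is not available on this JVM
        }
        List<long[]> usage = new ArrayList<>(); // each element: {cpuNanos, threadId}
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id); // -1 if the thread is no longer alive
            if (cpu >= 0) {
                usage.add(new long[] {cpu, id});
            }
        }
        usage.sort((a, b) -> Long.compare(b[0], a[0])); // descending by CPU time
        for (long[] u : usage.subList(0, Math.min(limit, usage.size()))) {
            ThreadInfo info = mx.getThreadInfo(u[1], 1); // depth 1: top stack frame only
            if (info == null) {
                continue; // thread ended between the two calls
            }
            StackTraceElement[] stack = info.getStackTrace();
            String frame = stack.length > 0 ? stack[0].toString() : "(no frame)";
            lines.add((u[0] / 1_000_000) + " ms  " + info.getThreadName() + "  " + frame);
        }
        return lines;
    }
}
```

The CPU times are cumulative, so two samples taken a few seconds apart are more telling than one: a thread whose CPU time keeps growing while its top frame stays inside `java.util.zip` would be the likely candidate.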

Through my research, it appears there was a JDK defect:

Calling 'java.util.zip.Deflater.finish()' prematurely hangs the application. 
The application is spinning consuming one cpu

https://bugs.openjdk.java.net/browse/JDK-8060193

It only happens randomly, apparently when multiple filters are involved.

I was able to reproduce this using the test class from the above JIRA ticket on a CentOS VM running JDK 1.8.0_201. That was surprising because, according to the docs and the ticket, this had been fixed.

On further research, I found a similar defect opened again against the JDK:

https://bugs.openjdk.java.net/browse/JDK-8193682

Now the team is not willing to work on it unless someone can reproduce it. Since it happens randomly in production, I am not sure how to reproduce it. The test class from https://bugs.openjdk.java.net/browse/JDK-8060193 still shows the issue. Is this even a valid test case? If it is valid, then there would be problems every time we send compressed data.

  • Our runtime JRE is JDK 1.8.
  • Compression is done at Tomcat, not at the load balancer.

Any pointers as to why this is happening and how we can solve it?

Update: One of the libraries we are using was throwing this exception: Malformed UTF-8 character (unexpected non-continuation byte 0x00, immediately after start byte 0xfd)

LastName, First’Name

As we can see, this is not a regular apostrophe. It can be produced by copy-pasting from Word, which autocorrects a regular apostrophe to this typographic character.
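To make the difference between the two apostrophes concrete, here is a small sketch that prints the bytes of each character. It assumes the character in question is U+2019 (RIGHT SINGLE QUOTATION MARK), which is what Word's autocorrect typically inserts. It does not reproduce the exact exception above; it only shows why such a character can trip a decoder that expects a different encoding. The class name `ApostropheBytes` is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting how a character is encoded.
final class ApostropheBytes {

    /** Space-separated lowercase hex dump of the string's bytes in the given charset. */
    static String hex(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Encodes with windows-1252, then (wrongly) decodes the bytes as UTF-8. */
    static String misreadAsUtf8(String s) {
        byte[] raw = s.getBytes(Charset.forName("windows-1252"));
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

With this helper, `hex("'", StandardCharsets.UTF_8)` gives `27`, `hex("\u2019", StandardCharsets.UTF_8)` gives `e2 80 99`, and `hex("\u2019", Charset.forName("windows-1252"))` gives `92`. A lone `0x92` byte is not valid UTF-8, so `misreadAsUtf8("\u2019")` yields the replacement character U+FFFD in Java, while a stricter decoder may raise an error instead.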

Our reproducer did throw an error, but the CPU did not get stuck. I think it only happens under high volume and traffic.

vsingh asked May 15 '19 17:05


2 Answers

As I said in an earlier comment, we face this problem when we generate Zip files that are written to the OutputStream of the HttpServletResponse through a ZipOutputStream.

The cores run at 100% because of three loops that become infinite under certain conditions: one in ZipOutputStream (closeEntry()) and two in DeflaterOutputStream (write() and finish()). These loops look like this:

while (!def.finished()) {
    deflate();
}

Where def is a java.util.zip.Deflater.
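To see what such a loop does in isolation, here is a minimal sketch that drives a `Deflater` directly. The loop only ends once `finished()` returns true, so anything that keeps the deflater from reaching that state makes it spin. The stall counter and its limit `MAX_STALLED_CALLS` are hypothetical additions for illustration; they are not part of the JDK classes mentioned above, and the class name `GuardedDeflate` is made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;

// Hypothetical illustration of the finish-and-drain loop with a progress guard.
final class GuardedDeflate {

    // Assumed limit: how many consecutive zero-byte deflate() calls are tolerated.
    static final int MAX_STALLED_CALLS = 1000;

    static byte[] compress(byte[] input) throws IOException {
        Deflater def = new Deflater();
        try {
            def.setInput(input);
            def.finish(); // signal that no more input will follow
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[512];
            int stalled = 0;
            while (!def.finished()) { // same condition as the loop quoted above
                int n = def.deflate(buf, 0, buf.length);
                if (n == 0) {
                    // No output and still not finished: count it instead of spinning forever.
                    if (++stalled > MAX_STALLED_CALLS) {
                        throw new IOException("Deflater made no progress");
                    }
                } else {
                    stalled = 0;
                    out.write(buf, 0, n);
                }
            }
            return out.toByteArray();
        } finally {
            def.end(); // release the native zlib resources
        }
    }
}
```

In the normal case the guard never triggers and the loop ends after a few iterations. The JDK stream classes have no such guard, which is why a deflater that never reports `finished()` keeps one core busy.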

If I understand correctly, this is the problem described in JDK-8193682. That ticket includes a workaround class which overrides the deflate method of ZipOutputStream.

I am going to try a class based on that workaround which accepts a timeout that is checked in the deflate method. I hope this approach does not produce resource leaks.
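A rough sketch of what such a timeout-checking subclass might look like follows. It is not the workaround class from the ticket; the class name `TimeoutZipOutputStream`, the constructor parameter `timeoutMillis`, and the choice of a single deadline for the whole stream are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.TimeUnit;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: a ZipOutputStream that refuses to deflate past a deadline.
class TimeoutZipOutputStream extends ZipOutputStream {

    private final long deadlineNanos;

    /** The deadline covers the whole lifetime of the stream, starting at construction. */
    TimeoutZipOutputStream(OutputStream out, long timeoutMillis) {
        super(out);
        this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    @Override
    protected void deflate() throws IOException {
        // The loops in write(), finish() and closeEntry() call deflate(), so a check
        // here turns an endless loop into an exception once the deadline has passed.
        if (System.nanoTime() - deadlineNanos >= 0) {
            throw new IOException("Zip stream timed out while deflating");
        }
        super.deflate();
    }
}
```

Throwing from `deflate()` stops the spinning, but it does not settle the leak concern: once the deadline has passed, `close()` also goes through `deflate()` and may throw again before the underlying `Deflater` is released, so the caller would still need to handle cleanup of the response stream.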

Related question: Thread locking when flushing jsp file

AdrianRM answered Nov 10 '22 18:11


I want to post an update on this problem, which has bugged us for years. We had an initiative underway to migrate static content to a CDN. After the CDN was implemented and all static resources were served from a different server, the ZipStream problem went away. Since the research showed that the problem was related to dynamic content rather than static content, I am not sure how the problem got solved. Maybe someone reading this answer can explain how this got fixed.

vsingh answered Nov 10 '22 17:11