
Java xpath memory leak?

I have a situation that's been torturing me for months: I keep getting OOM exceptions (Heap Space) and on inspecting heap dumps I've found millions of instances of objects I never allocated but that were likely allocated in underlying libraries. After much blood, sweat and tears I have managed to localize the code generating the memory leak and I have composed a minimal, complete and verifiable code sample to illustrate this:

import java.util.logging.Level;
import java.util.logging.Logger;
import javafx.application.Application;
import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.concurrent.Worker;
import javafx.scene.web.WebEngine;
import javafx.stage.Stage;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MVC extends Application implements ChangeListener<Worker.State>{

    private final WebEngine engine = new WebEngine();
    private final String url = "https://biblio.ugent.be/publication?sort=publicationstatus.desc&sort=year.desc&limit=250&start=197000";
    private final XPath x = XPathFactory.newInstance().newXPath();

    @Override
    public void start(Stage primaryStage) throws Exception {
        System.setProperty("jsse.enableSNIExtension", "false");
        engine.getLoadWorker().stateProperty().addListener(this);
        engine.load(url);
    }

    public static void main(String[] args) {
        launch(args);
    }

    private NodeList eval(Node context, String xpath) throws XPathExpressionException{
        return (NodeList)x.evaluate(xpath, context, XPathConstants.NODESET);
    }

    @Override
    public void changed(ObservableValue<? extends Worker.State> observable, Worker.State oldValue, Worker.State newValue) {
        if (newValue==Worker.State.SUCCEEDED) {
            try {
                while(true){
                    NodeList eval = eval(engine.getDocument(), "//span[@class='title']");
                    int s = eval.getLength();
                }
            } catch (XPathExpressionException ex) {
                Logger.getLogger(MVC.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }
}

The code does the following:

  • load a document using the JavaFX WebEngine.
  • endlessly run an XPath query on the document using the javax.xml packages, without storing the result or any reference to it.

To run, create a JavaFX application, add a file named MVC.java in the default package, enter the code and hit run. Any profiling tool (I use VisualVM) should quickly show you that in a matter of minutes, the heap grows uncontrollably. The following objects seem to be allocated but never released:

  • java.util.HashMap$Node
  • com.sun.webkit.Disposer$WeakDisposerRecord
  • com.sun.webkit.dom.NamedNodeMapImpl$SelfDisposer
  • java.util.concurrent.LinkedBlockingQueue$Node
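
For anyone who wants to confirm the growth without attaching a profiler, here is a minimal sketch that samples used heap after requesting a GC. HeapSampler and its methods are hypothetical names, not part of the reproduction above, and System.gc() is only a hint to the JVM, so the numbers are indicative at best. The main method retains memory on purpose to stand in for a leak.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the original reproduction.
public class HeapSampler {

    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Used heap in bytes, sampled after requesting a GC (System.gc() is only a hint). */
    public static long usedHeapAfterGc() {
        System.gc();
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    /** True if there are at least two samples and each is larger than the one before it. */
    public static boolean steadilyGrowing(long[] samples) {
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return samples.length > 1;
    }

    public static void main(String[] args) {
        // Deliberately retained allocations stand in for a leak.
        List<byte[]> retained = new ArrayList<>();
        long[] samples = new long[5];
        for (int i = 0; i < samples.length; i++) {
            retained.add(new byte[8 * 1024 * 1024]);
            samples[i] = usedHeapAfterGc();
            System.out.printf("sample %d: %d MiB used after GC%n",
                    i, samples[i] / (1024 * 1024));
        }
        System.out.println("steadily growing: " + steadilyGrowing(samples));
        System.out.println("retained blocks: " + retained.size());
    }
}
```

If the post-GC figure keeps climbing across samples while the program is doing the same work over and over, that points to objects being retained rather than to normal allocation churn.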

This behavior happens every time I run the code, regardless of the url I load or the xpath I execute on the document.

Setup with which I tested:

  • MBP running OS X Yosemite (up-to-date)
  • JDK 1.8.0_60

Can anyone reproduce this issue? Is it an actual memory leak? Is there anything I can do?

edit

A colleague of mine reproduced the problem on a Windows 7 machine with JDK 1.8.0_45, and it happens on an Ubuntu server as well.

edit 2

I've tested Jaxen as an alternative to the javax.xml package, but the results are the same, which leads me to believe the bug lies deep within the com.sun.webkit code.

RDM asked Sep 17 '15 15:09




1 Answer

I reproduced the leak with JDK 1.8.0_60 on Ubuntu too. After quite a bit of profiling and debugging, the root cause turns out to be simple and easy to fix. There is no memory leak in the XPath code itself.

There is a class, com.sun.webkit.Disposer, that continuously cleans up internal structures created during the XPath evaluation. The disposer internally triggers the cleanup via Invoker.getInvoker().invokeOnEventThread(this); you can see this if you decompile the code. There are different Invoker implementations that use different threads. When you work within JavaFX, the Invoker performs the cleanup periodically on the JavaFX thread.

However, your changed listener method is also called on the JavaFX thread, and it never returns, so the cleanup never gets a chance to run.
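
To illustrate the mechanism without JavaFX, here is a minimal sketch in plain Java. It is only an analogy: a single-threaded executor stands in for the JavaFX thread, and a counter stands in for the Disposer's cleanup. EventThreadStarvationDemo and cleanupsDuringWork are hypothetical names, and none of this is the actual WebKit code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation; the executor only imitates an event thread.
public class EventThreadStarvationDemo {

    /**
     * Runs work that keeps posting cleanup tasks to a simulated event thread.
     * Returns how many cleanups had run by the time the work finished.
     */
    public static int cleanupsDuringWork(boolean workOnEventThread) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        AtomicInteger cleanupsRun = new AtomicInteger();
        AtomicInteger seenDuringWork = new AtomicInteger();

        Runnable work = () -> {
            for (int i = 0; i < 50; i++) {
                // Each "query" asks the event thread to clean up after it.
                eventThread.execute(cleanupsRun::incrementAndGet);
                sleep(2);
            }
            seenDuringWork.set(cleanupsRun.get());
        };

        try {
            if (workOnEventThread) {
                // The work occupies the only event thread; cleanups just queue up.
                eventThread.submit(work).get();
            } else {
                // The work runs elsewhere; the event thread is free to clean up.
                Thread worker = new Thread(work);
                worker.start();
                worker.join();
            }
            eventThread.shutdown();
            eventThread.awaitTermination(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seenDuringWork.get();
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("work on event thread, cleanups during work: "
                + cleanupsDuringWork(true));
        System.out.println("work on separate thread, cleanups during work: "
                + cleanupsDuringWork(false));
    }
}
```

In the first case no cleanup can run while the work is in progress, because the cleanups are queued behind the very task that posts them. In your code that task is an infinite loop, so the queue only ever grows.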

I modified your code so that the changed method only spawns a new thread and returns, with the processing done asynchronously. And guess what, the memory no longer grows:

@Override
public void changed(ObservableValue<? extends Worker.State> observable, Worker.State oldValue, Worker.State newValue) {
    if (newValue==Worker.State.SUCCEEDED) {
        new Thread(() ->{
            try {
                while(true){
                    NodeList eval = eval(engine.getDocument(), "//span[@class='title']");
                    int s = eval.getLength();
                }
            } catch (XPathExpressionException ex) {
                Logger.getLogger(MVC.class.getName()).log(Level.SEVERE, null, ex);
            }
        }).start();
    }
}
Jan X Marek answered Oct 19 '22 17:10
