
FactoryFinder performance/bad caching

I've got a rather large Java EE application with a huge classpath that does a lot of XML processing. Currently I am trying to speed up some of my functions and locate slow code paths with sampling profilers.

One thing I noticed is that the parts of our code that make calls like TransformerFactory.newInstance(...) are especially slow. I tracked this down to the FactoryFinder method findServiceProvider, which always creates a new ServiceLoader instance. In the ServiceLoader javadoc I found the following note about caching:

Providers are located and instantiated lazily, that is, on demand. A service loader maintains a cache of the providers that have been loaded so far. Each invocation of the iterator method returns an iterator that first yields all of the elements of the cache, in instantiation order, and then lazily locates and instantiates any remaining providers, adding each one to the cache in turn. The cache can be cleared via the reload method.

So far so good. This is part of OpenJDK's FactoryFinder#findServiceProvider method:

private static <T> T findServiceProvider(final Class<T> type)
        throws TransformerFactoryConfigurationError
{
    try {
        return AccessController.doPrivileged(new PrivilegedAction<T>() {
            public T run() {
                final ServiceLoader<T> serviceLoader = ServiceLoader.load(type);
                final Iterator<T> iterator = serviceLoader.iterator();
                if (iterator.hasNext()) {
                    return iterator.next();
                } else {
                    return null;
                }
            }
        });
    } catch (ServiceConfigurationError e) {
        ...
    }
}

Every call to findServiceProvider calls ServiceLoader.load, which creates a new ServiceLoader each time. As a result, ServiceLoader's caching mechanism is never used: every call scans the classpath again for the requested service provider.
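To make the difference concrete, here is a minimal sketch that times a provider lookup with a fresh ServiceLoader per call against a single reused instance. The class and method names (ServiceLoaderReuseDemo, timeLookupNanos, freshLoadersAreDistinct) are made up for illustration, and the numbers depend entirely on the classpath, so treat it as a way to reproduce the effect rather than as a benchmark.

```java
import java.util.Iterator;
import java.util.ServiceLoader;
import javax.xml.transform.TransformerFactory;

// Hypothetical demo class; not part of the JDK or of the application in question.
class ServiceLoaderReuseDemo {

    /** Asks the loader for its first provider (if any) and returns the elapsed nanoseconds. */
    static long timeLookupNanos(ServiceLoader<TransformerFactory> loader) {
        long start = System.nanoTime();
        Iterator<TransformerFactory> it = loader.iterator();
        if (it.hasNext()) {
            it.next();
        }
        return System.nanoTime() - start;
    }

    /** ServiceLoader.load hands out a new loader, with an empty cache, on every call. */
    static boolean freshLoadersAreDistinct() {
        return ServiceLoader.load(TransformerFactory.class)
                != ServiceLoader.load(TransformerFactory.class);
    }

    public static void main(String[] args) {
        // One loader kept around, so its provider cache can actually be hit.
        ServiceLoader<TransformerFactory> shared = ServiceLoader.load(TransformerFactory.class);
        for (int i = 0; i < 5; i++) {
            // Mirrors what findServiceProvider does: a new loader for each lookup.
            long fresh = timeLookupNanos(ServiceLoader.load(TransformerFactory.class));
            long reused = timeLookupNanos(shared);
            System.out.printf("run %d: fresh=%d us, reused=%d us%n",
                    i, fresh / 1_000, reused / 1_000);
        }
    }
}
```

Note that the ServiceLoader javadoc states that instances are not safe for use by multiple concurrent threads, and that ServiceLoader.load(Class) resolves providers through the current thread's context class loader, so a single shared loader would need extra care in a container with several web applications.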

What I've already tried:

  1. I know you can set a system property like javax.xml.transform.TransformerFactory to specify a particular implementation. This way FactoryFinder skips the ServiceLoader lookup and it's super fast. Sadly, this is a JVM-wide property and affects other Java applications running in the same JVM. For example, my application ships with Saxon and should use com.saxonica.config.EnterpriseTransformerFactory, but I have another application which does not ship with Saxon. As soon as I set the system property, that other application fails to start, because there is no com.saxonica.config.EnterpriseTransformerFactory on its classpath. So this does not seem to be an option for me.
  2. I already refactored every place where TransformerFactory.newInstance is called so that the TransformerFactory is cached. But there are various places in my dependencies where I cannot refactor the code.
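For reference, the caching described in item 2 could look roughly like the sketch below. The holder class CachedTransformerFactory is a hypothetical name, and the per-thread cache is an assumption made here because the JAXP documentation does not guarantee that a TransformerFactory is thread-safe; it is not necessarily how the application does it.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical holder class, shown only to illustrate the caching idea.
final class CachedTransformerFactory {

    // One factory per thread: newInstance (and the service lookup behind it)
    // runs once per thread instead of once per call.
    private static final ThreadLocal<TransformerFactory> FACTORY =
            ThreadLocal.withInitial(TransformerFactory::newInstance);

    private CachedTransformerFactory() {
    }

    /** Returns the factory cached for the calling thread. */
    static TransformerFactory get() {
        return FACTORY.get();
    }
}
```

Callers would then use CachedTransformerFactory.get().newTransformer() instead of TransformerFactory.newInstance().newTransformer(). In a servlet container, a ThreadLocal on pooled threads can keep the web application's classes reachable after undeploy, so a real implementation may prefer a different scope for the cache.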

My questions are: Why does FactoryFinder not reuse a ServiceLoader? Is there a way to speed up this whole ServiceLoader process other than using system properties? Couldn't this be changed in the JDK so that a FactoryFinder reuses a ServiceLoader instance? Also, this is not specific to a single FactoryFinder. The behaviour is the same for all FactoryFinder classes in the javax.xml package I have looked at so far.

I am using OpenJDK 8/11. My applications are deployed in a Tomcat 9 instance.

Edit: Providing more details

Here is the call stack for a single XMLInputFactory.newInstance call: [screenshot of the profiler call stack]

Most of the time is spent in ServiceLoader$LazyIterator.hasNextService. This method calls getResources on the ClassLoader to read the META-INF/services/javax.xml.stream.XMLInputFactory file. That call alone takes about 35 ms each time.
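The cost of that lookup can be isolated from the rest of the factory machinery with a small timing loop like the one below. ServiceFileLookupTimer and its methods are hypothetical names; the sketch only repeats the same getResources call on the context class loader and prints how long each round takes, so it needs to run inside the web application to say anything about Tomcat's class loader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for measuring the service file lookup in isolation.
class ServiceFileLookupTimer {

    static final String SERVICE_FILE =
            "META-INF/services/javax.xml.stream.XMLInputFactory";

    /** Collects every classpath location that contains the service file. */
    static List<URL> lookup(ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(SERVICE_FILE));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the duration of a single lookup in microseconds. */
    static long timeLookupMicros(ClassLoader loader) {
        long start = System.nanoTime();
        lookup(loader);
        return (System.nanoTime() - start) / 1_000;
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        for (int i = 0; i < 10; i++) {
            System.out.println("lookup " + i + ": " + timeLookupMicros(loader)
                    + " us, " + lookup(loader).size() + " match(es)");
        }
    }
}
```

If later rounds are much faster than the first, the cost is mostly a cold cache; if every round stays slow, the class loader is redoing the search each time.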

Is there a way to instruct Tomcat to better cache these files so they are served faster?

Wagner Michael asked Oct 14 '19 12:10


1 Answer

35 ms sounds like disk access is involved, and that points to a problem with OS caching.

Any directory (non-jar) entries on the classpath can slow things down. So can the resource not being present at the first location that is checked.

ClassLoader.getResource can be overridden if you can set the thread context class loader, either through configuration (I haven't touched Tomcat for years) or just with Thread.setContextClassLoader.
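A minimal sketch of that idea follows, assuming it is acceptable to wrap the existing context class loader for the duration of a call. ServiceCachingClassLoader, cachedEntryCount and runWith are hypothetical names, and the sketch overrides getResources (the method the question's profile points at) rather than getResource. It caches only the list of URLs for META-INF/services entries; the service files are still opened and parsed on each lookup, and the cache never notices jars that change at runtime.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper loader; everything except the lookup cache is delegated to the parent.
class ServiceCachingClassLoader extends ClassLoader {

    private static final String PREFIX = "META-INF/services/";

    // Resource name -> locations found by the first lookup.
    private final Map<String, List<URL>> cache = new ConcurrentHashMap<>();

    ServiceCachingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    public Enumeration<URL> getResources(String name) throws IOException {
        if (!name.startsWith(PREFIX)) {
            return super.getResources(name);
        }
        List<URL> urls = cache.get(name);
        if (urls == null) {
            // First request for this service file: do the real classpath search once.
            urls = Collections.unmodifiableList(Collections.list(super.getResources(name)));
            cache.put(name, urls);
        }
        return Collections.enumeration(urls);
    }

    /** Number of service file names whose locations are currently cached. */
    int cachedEntryCount() {
        return cache.size();
    }

    /** Runs the task with the given loader as context class loader, then restores the old one. */
    static void runWith(ClassLoader loader, Runnable task) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            task.run();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

One instance would be created per web application, with that application's own class loader as parent, and reused for calls into the dependencies that cannot be refactored. Whether Tomcat or those libraries tolerate a wrapped context class loader is untested here.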

Tom Hawtin - tackline answered Oct 09 '22 20:10