URLClassLoader.getResources("") (empty resource name) not giving roots of jars

Consider a URLClassLoader parameterized with a collection of URLs which is a mix of expanded directories and jar files. For example:

URL[] urls = new URL[] {
    new URL("file:/D:/work/temp/jars/spring-security-core-3.2.0.RELEASE.jar"),
    new URL("file:/D:/work/temp/jars/spring-security-config-3.2.0.RELEASE.jar"),
    ...
    new URL("file:/D:/work/temp/domain/bin/"),
    new URL("file:/D:/work/temp/web/bin/"),
    ...
};
URLClassLoader cl = new URLClassLoader(urls);

The classloader correctly handles getResources() requests for resources located somewhere inside a package, like "org/my/package/conf.properties". By "correctly handles" I mean that the classloader successfully finds all matches inside both directories and jars.

A special empty string name passed to getResources("") is supposed to yield the URLs of all available roots (in both the directories and the jars). However, there is a known limitation in ClassLoaders: only the roots that correspond to directories are returned. All roots of jars are discarded.

Using classloader.getURLs() instead of classloader.getResources("") will not work for me: I have a complex graph of interdependent URLClassLoaders, so the results would be completely different. Also, my classloaders are consumed by third-party classpath scanning facilities that call getResources("") to set up an internal search base. As a result, resources located in jars are simply not found.

I currently have a working fix where I extend from URLClassLoader and manually handle requests with an empty string by forcing roots for jars in addition to those for directories within the returned collection of URLs.

However my questions are:

  • What was the conceptual/technical reason for this limitation (where paths to jars are not returned)?

  • By fixing this manually, do I violate any important contract?

  • Is there any nice way to get the desired behavior?

Thanks for any thoughts on that!

Pavel S. asked Feb 17 '14 18:02


1 Answer

What was the conceptual/technical reason for this limitation (where paths to jars are not returned)?

The behavior of ClassLoader.getResources("") is unspecified.

The implementation for loading resources from the file system in URLClassPath$Loader is solely based on URLs. It constructs a new file URL by adding the resource name to the base URL of the directory and returns the URL when it points to an existing resource. There's no special handling for an empty resource name. Whether this is wanted behavior or not is undocumented.
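The directory case can be observed with a small experiment. This is only a sketch: the class name, method name and temporary directory are made up for illustration, and what it prints reflects unspecified behavior of the JDK it runs on.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

class EmptyNameDemo {

    // Returns whatever a URLClassLoader over a single directory yields for "".
    static List<URL> directoryRoots(Path dir) throws IOException {
        // For an existing directory, toUri() produces a URL ending in "/",
        // which URLClassLoader treats as an expanded directory.
        URL[] urls = { dir.toUri().toURL() };
        // Null parent: only this loader's own search path is consulted.
        try (URLClassLoader cl = new URLClassLoader(urls, null)) {
            return Collections.list(cl.findResources(""));
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo-root"); // hypothetical directory
        System.out.println(directoryRoots(dir));
    }
}
```

On the JDKs I am aware of, the list contains the directory URL itself, because the empty name resolves to the base URL and the directory exists.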

The implementation for JAR files in URLClassPath$JarLoader works on an index over JAR files. To get the same behavior for JAR files, the implementation would need special handling for empty resource names, i.e. it would need to check for an empty resource name first and return the file URL of the JAR file instead of searching within the index. The implementation has no such special handling. Whether this is wanted behavior or not is again undocumented.

Since the API specification does not specify the behavior for empty resource names, both implementations are valid.

Some may argue that exposing roots is a security issue, especially when running in a sandbox. Others may argue that getResources() should return nothing for an empty name (an empty enumeration, or null from getResource()), since no resource with the name "" actually exists.

In any case, the current behavior of URLClassLoader leads to unexpected behavior in Class.getResource(). When this method is called with an empty string for a class in the default package, it returns the root directory of the class if the class was loaded from the file system. This violates the contract of the method. For details see, for example, this open Java bug: https://bugs.openjdk.java.net/browse/JDK-8202687.

By fixing this manually, do I violate any important contract?

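As long as you only override the findResources() method of your ClassLoader (this is the method getResources() delegates to; findResource() is its single-result counterpart), call the super method, and then add the additional URLs of your JAR files, you shouldn't violate any contract.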
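A minimal sketch of such an override follows. The class name is hypothetical, and two choices in it are my assumptions rather than anything the JDK prescribes: JARs are recognized by a ".jar" suffix, and the root of a JAR is expressed as a "jar:...!/" URL (the plain file URL of the JAR would be another option).

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class JarRootAwareClassLoader extends URLClassLoader {

    JarRootAwareClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    public Enumeration<URL> findResources(String name) throws IOException {
        Enumeration<URL> found = super.findResources(name);
        if (!name.isEmpty()) {
            return found; // ordinary lookups are left untouched
        }
        List<URL> roots = new ArrayList<>(Collections.list(found));
        // Track what is already present by string form to avoid duplicates.
        Set<String> seen = new HashSet<>();
        for (URL root : roots) {
            seen.add(root.toExternalForm());
        }
        for (URL url : getURLs()) {
            // Assumption: anything ending in ".jar" is a JAR file.
            if (url.getPath().endsWith(".jar")) {
                String jarRoot = "jar:" + url.toExternalForm() + "!/";
                if (seen.add(jarRoot)) {
                    roots.add(new URL(jarRoot));
                }
            }
        }
        return Collections.enumeration(roots);
    }
}
```

Because only the empty name is special-cased and the super result is always kept, every other lookup behaves exactly as in a plain URLClassLoader.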

But be aware that there are already implementations out there that treat URLClassLoaders specially. For example, Spring's PathMatchingResourcePatternResolver has special handling for class loaders that are instances of URLClassLoader, which adds additional URLs for JARs.

Is there any nice way to get the desired behavior?

There is no nice way to get the desired behavior, since every solution would be based on unspecified behavior that may theoretically change with every new JRE version.

With the introduction of multi-release JAR files in Java 9, the behavior has already changed:

For a multi-release JAR file with Java 8 classes and Java 9 classes, ClassLoader.getResource("") now returns a URL for the JAR file when it is executed on a JRE version > 8. With JRE 8 it still returns no URL for the same JAR file. The URLs returned for an empty resource name therefore now depend even on the JRE version and on the type of JAR file.

There are workarounds that also yield the URLs for JAR files. PathMatchingResourcePatternResolver, for example, loads JAR file names from the java.class.path system property (in the case of the system class loader) and loads additional URLs by calling URLClassLoader.getURLs() (in the case of a URLClassLoader). But again, these are only workarounds based on unspecified behavior.
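The java.class.path part of such a workaround could look roughly like the following. This is not Spring's code, just an illustration of the idea; the class and method names are hypothetical, and the ".jar" suffix check and the "jar:...!/" form of the root URL are assumptions.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class ClassPathJarRoots {

    // Turns every ".jar" entry of a class path string into a "jar:...!/" root URL.
    static List<URL> jarRoots(String classPath) throws MalformedURLException {
        List<URL> roots = new ArrayList<>();
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            // Assumption: JAR entries are recognized by their file suffix only.
            if (entry.toLowerCase(Locale.ROOT).endsWith(".jar")) {
                roots.add(new URL("jar:" + new File(entry).toURI() + "!/"));
            }
        }
        return roots;
    }

    public static void main(String[] args) throws MalformedURLException {
        // Class path of the running JVM, as seen by the system class loader.
        System.out.println(jarRoots(System.getProperty("java.class.path")));
    }
}
```

Directories on the class path are skipped here, since the class loader already reports those for the empty name.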

Ideally, searches on the classpath are only performed in the context of a Java package. Frameworks like Spring (Boot) do exactly that. This avoids relying on unspecified behavior of class loaders and also avoids searching the JAR files of irrelevant third-party libraries. So, whenever possible, I recommend searching the classpath in the context of a Java package instead of searching for resources with an empty resource name.
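A package-scoped lookup might be sketched like this. The class name, method name and the package "org.example.conf" are placeholders. One caveat I am aware of: a JAR only matches if it contains an explicit directory entry for the package, so JARs built without directory entries would be missed.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class PackageScopedSearch {

    // Asks the class loader for every location that contains the given package.
    static List<URL> locate(ClassLoader cl, String packageName) throws IOException {
        // Resource names use '/' as separator and have no leading slash.
        String path = packageName.replace('.', '/');
        return Collections.list(cl.getResources(path));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        // "org.example.conf" is a placeholder package name.
        for (URL location : locate(cl, "org.example.conf")) {
            System.out.println(location);
        }
    }
}
```

Each returned URL can then serve as the search base for that package, whether it points into a directory or into a JAR.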

rmunge answered Nov 15 '22 00:11