
Custom Java classloader not being used to load dependencies?

I've been trying to set up a custom classloader that intercepts class loading so it can print out which classes are being loaded into the application. The classloader looks like this:

public class MyClassLoader extends ClassLoader {
    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        System.out.println("Loading: " + name);
        return super.loadClass(name);
    }
}     

It just spits out the name of every class it loads. However, when I try to run some code,

import org.python.util.PythonInterpreter;
public class Scripts {
    public String main(){

        PythonInterpreter p = new PythonInterpreter();
        p.exec("print 'Python ' + open('.gitignore').read()");

        return "Success! Nothing broke";
    }
}

via

MyClassLoader bcl = new MyClassLoader();
Class<?> c = bcl.loadClass("Scripts");

Method m = c.getMethod("main");
String result = (String) m.invoke(c.getConstructor().newInstance());

it prints out

Loading: Scripts
Loading: java.lang.Object
Loading: java.lang.String
Loading: org.python.util.PythonInterpreter
Python build/
.idea/*
*.iml
RESULT: Success! Nothing broke

Which seems rather odd. org.python.util.PythonInterpreter is not a simple class, and it depends on a whole bunch of other classes in the org.python.util package. Those classes are clearly being loaded, for the exec'd python code is able to do stuff and read my file. For some reason, though, those classes are not being loaded by the classloader which loaded PythonInterpreter.

Why is that? I was under the impression that the classloader used to load a class C would be used to load all the other classes needed by C, but that's clearly not happening here. Is that assumption mistaken? If it is, how do I set it up such that all the transitive dependencies of C are loaded by my classloader?

EDIT:

Some experiments with using URLClassLoader, which was suggested. I modified the delegation in loadClass():

try{
    byte[] output = IOUtils.toByteArray(this.getResourceAsStream(name));
    return instrument(defineClass(name, output, 0, output.length));
}catch(Exception e){
    return instrument(super.loadClass(name));
}

as well as made MyClassLoader subclass URLClassLoader rather than plain ClassLoader, grabbing URLs via:

super(((URLClassLoader)ClassLoader.getSystemClassLoader()).getURLs());

But it doesn't seem to be the right thing. In particular, getResourceAsStream() is returning null for every class I request, even non-system classes like that Jython lib.

Li Haoyi asked Nov 09 '12 20:11


2 Answers

Basics of Class Loading

There are two main places to extend a class loader to change the way classes are loaded:

  • findClass(String name) - You override this method when you want to find a class with the usual parent first delegation.
  • loadClass(String name, boolean resolve) - Override this method when you want to change the way that class loading delegation is done.
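As a rough sketch of the first option, the loader below overrides only findClass(String). The names FindClassDemo, FindingLoader and Sample are made up for illustration, and the sketch assumes the compiled .class files are readable as resources from the class loader that loaded the demo. The parent is set to null (bootstrap loader only) so that the normal parent-first lookup fails for application classes and findClass actually gets called.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class FindClassDemo {
    /** Hypothetical class used only as something to load. */
    public static class Sample {}

    /** Keeps the default parent-first delegation and overrides only findClass. */
    static class FindingLoader extends ClassLoader {
        private final ClassLoader source;               // where the class bytes are read from
        final List<String> found = new ArrayList<>();   // names that reached findClass

        FindingLoader(ClassLoader source) {
            super(null); // assumption: null parent means "bootstrap loader only"
            this.source = source;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Resources are addressed by path, not by class name.
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = source.getResourceAsStream(path)) {
                if (in == null) throw new ClassNotFoundException("Could not find " + name);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
                byte[] bytes = out.toByteArray();
                found.add(name);
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException("Could not read " + name, e);
            }
        }
    }

    static FindingLoader newLoader() {
        return new FindingLoader(FindClassDemo.class.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        FindingLoader loader = newLoader();
        Class<?> c = loader.loadClass(Sample.class.getName());
        System.out.println("findClass was asked for: " + loader.found);
        System.out.println("defined by our loader: " + (c.getClassLoader() == loader));
    }
}
```

Note that java.lang.Object, the superclass of Sample, never reaches findClass: the bootstrap loader answers for it first, which is exactly the parent-first behaviour that findClass leaves untouched.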

However, classes can only come from the final defineClass(...) methods provided by java.lang.ClassLoader. Since you would like to capture all of the classes that are loaded, we will need to override loadClass( String, boolean ) and use a call to defineClass(...) somewhere in it.

NOTE: The defineClass(...) methods refuse to define classes in the java.* packages. A custom class loader that tries gets a SecurityException, so those classes can only come from the JVM's own built-in class loaders. This prevents you from messing with the internals of Java itself.
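The java.* restriction can be observed directly. The following is a minimal sketch, with ProhibitedPackageDemo and ProbeLoader as hypothetical names; it passes an empty byte array on purpose, relying on the name check happening before the class bytes are parsed.

```java
public class ProhibitedPackageDemo {
    /** Hypothetical loader that only exists to call the protected defineClass. */
    static class ProbeLoader extends ClassLoader {
        /** Returns true if defineClass rejected the name itself. */
        boolean isRefused(String name) {
            try {
                defineClass(name, new byte[0], 0, 0);
                return false;
            } catch (SecurityException e) {
                // The name was rejected before the bytes were even looked at.
                return true;
            } catch (ClassFormatError e) {
                // The name was acceptable; only the (empty) bytes were invalid.
                return false;
            }
        }
    }

    static boolean refused(String name) {
        return new ProbeLoader().isRefused(name);
    }

    public static void main(String[] args) {
        System.out.println("java.lang.Fake refused: " + refused("java.lang.Fake"));
        System.out.println("com.example.Fake refused: " + refused("com.example.Fake"));
    }
}
```

This is why the example loader further down hands every name starting with "java." straight to its parent instead of trying to define it.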

An Example Child First ClassLoader

This is a very simple implementation of the ClassLoader that you are trying to create. It assumes that all of the classes you need are available to the parent class loader, so it just uses the parent as a source for class bytes. This implementation uses Apache Commons IO for brevity, but that dependency could easily be removed.

import java.io.IOException;
import java.io.InputStream;

import static org.apache.commons.io.IOUtils.toByteArray;
import static org.apache.commons.io.IOUtils.closeQuietly;
...
public class MyClassLoader
  extends ClassLoader {
  MyClassLoaderListener listener;

  MyClassLoader(ClassLoader parent, MyClassLoaderListener listener) {
    super(parent);
    this.listener = listener;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
    throws ClassNotFoundException {
    // respect the java.* packages.
    if( name.startsWith("java.")) {
      return super.loadClass(name, resolve);
    }
    else {
      // see if we have already loaded the class.
      Class<?> c = findLoadedClass(name);
      if( c != null ) return c;

      // the class is not loaded yet.  Since the parent class loader has all of the
      // definitions that we need, we can use it as our source for classes.
      InputStream in = null;
      try {
        // get the input stream, throwing ClassNotFound if there is no resource.
        in = getParent().getResourceAsStream(name.replace('.', '/') + ".class");
        if( in == null ) throw new ClassNotFoundException("Could not find "+name);

        // read all of the bytes and define the class.
        byte[] cBytes = toByteArray(in);
        c = defineClass(name, cBytes, 0, cBytes.length);
        if( resolve ) resolveClass(c);
        if( listener != null ) listener.classLoaded(c);
        return c;
      } catch (IOException e) {
        throw new ClassNotFoundException("Could not load "+name, e);
      }
      finally {
        closeQuietly(in);
      }
    }
  }
}

And this is a simple listener interface for watching classes load.

public interface MyClassLoaderListener {
  public void classLoaded( Class<?> c );
}

You can then create a new instance of MyClassLoader, with the current class loader as the parent, and monitor classes as they are loaded.

MyClassLoader classLoader = new MyClassLoader(this.getClass().getClassLoader(), new MyClassLoaderListener() {
  public void classLoaded(Class<?> c) {
    System.out.println(c.getName());
  }
});
classLoader.loadClass(...);

This will work in the most general case and will allow you to get notified when classes are loaded. However, if any of those classes create their own child first class loaders, then they could bypass the notification code added here.

More Advanced Class Loading

To really trap classes being loaded, even when a child class loader overrides loadClass(String, boolean), you have to insert code between the classes you are loading and any of the calls that they may make to ClassLoader.defineClass(...). To do this, you have to start getting into byte code rewriting with a tool like ASM. I have a project called Chlorine on GitHub that uses this method to rewrite java.net.URL constructor calls. If you are curious about messing with classes at load time, I would check that project out.

Christian Trimble answered Oct 05 '22 21:10


If you want to print the classes as they are loaded, how about switching on the -verbose:class option on the JVM?

java -verbose:class your.class.name.here

To answer your direct questions:

"Why is that? I was under the impression that the classloader used to load a class C would be used to load all the other classes needed by C, but that's clearly not happening here. Is that assumption mistaken? If it is, how do I set it up such that all the transitive dependencies of C are loaded by my classloader?"

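Class loaders form a tree, and by default loading is parent-first: a request travels from the leaf ClassLoader up to the root, and the load itself is then attempted from the root back down to the leaf that initiated the resolution. The first loader on that path that can find the class defines it, and the loader that defines a class is the one the JVM uses to resolve that class's own dependencies. In your case the parent finds and defines org.python.util.PythonInterpreter, so everything PythonInterpreter needs is requested from the parent and never passes through MyClassLoader.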
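A small sketch makes the distinction between the loader you call and the loader that defines the class visible. LoggingLoader has the same shape as the loader in the question; DelegationDemo and Probe are hypothetical names used only for this illustration.

```java
public class DelegationDemo {
    /** Hypothetical class that stands in for PythonInterpreter. */
    public static class Probe {}

    /** Same idea as the question's loader: log the name, then delegate. */
    static class LoggingLoader extends ClassLoader {
        LoggingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        public Class<?> loadClass(String name) throws ClassNotFoundException {
            System.out.println("Loading: " + name);
            return super.loadClass(name); // parent-first: the parent gets to define it
        }
    }

    /** True if the class obtained through LoggingLoader was defined by the parent. */
    static boolean definedByParent() throws ClassNotFoundException {
        ClassLoader parent = DelegationDemo.class.getClassLoader();
        LoggingLoader logging = new LoggingLoader(parent);
        Class<?> c = logging.loadClass(Probe.class.getName());
        // getClassLoader() reports the defining loader, not the one we called.
        return c.getClassLoader() == parent && c == Probe.class;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("defined by parent: " + definedByParent());
    }
}
```

Because the returned class belongs to the parent, any class it references later is requested from the parent as well, and the logging loader is never consulted again.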

Why? Consider what should happen if your custom class wants to load something from the Java standard libraries. It should be loaded by the JVM's own class loader at the root of the tree rather than by your custom loader, so that the class can be shared as widely as possible, especially since the class being loaded may then load a whole lot more classes.

This also avoids ending up with multiple instances of the same system class, each with the same fully qualified name, loaded in different ClassLoaders. EDIT Classes would be resolved correctly in their ClassLoader. However, there are two problems.

  1. Let's say we have two String instances, a and b. a.getClass().isInstance(b) and a.getClass() == b.getClass() are not true if the String classes of a and b were loaded by different ClassLoaders. This would cause horrific problems.
  2. Singletons: they would not be singletons - you can have one per ClassLoader.

END EDIT
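Both problems can be reproduced with an ordinary application class. The sketch below is only an illustration: ClassIdentityDemo, IsolatingLoader and Registry are hypothetical names, and it assumes the compiled .class file for Registry can be read as a resource from the parent loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassIdentityDemo {
    /** Hypothetical singleton-style class with one piece of static state. */
    public static class Registry {
        public static int counter = 0;
    }

    /** Child-first for Registry only; every other name goes to the parent. */
    static class IsolatingLoader extends ClassLoader {
        private static final String TARGET = Registry.class.getName();

        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(TARGET)) return super.loadClass(name, resolve);
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(getParent(), name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    /** Reads the .class resource for the given class name from the source loader. */
    static byte[] readClassBytes(ClassLoader source, String name) throws ClassNotFoundException {
        try (InputStream in = source.getResourceAsStream(name.replace('.', '/') + ".class")) {
            if (in == null) throw new ClassNotFoundException("Could not find " + name);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException("Could not read " + name, e);
        }
    }

    /** Loads Registry through a brand-new isolating loader. */
    static Class<?> loadIsolated() throws ClassNotFoundException {
        return new IsolatingLoader(ClassIdentityDemo.class.getClassLoader())
                .loadClass(Registry.class.getName());
    }

    public static void main(String[] args) throws Exception {
        Class<?> a = loadIsolated();
        Class<?> b = loadIsolated();
        Object instanceOfB = b.getConstructor().newInstance();
        System.out.println("same name: " + a.getName().equals(b.getName()));
        System.out.println("same class: " + (a == b));
        System.out.println("a.isInstance(b's object): " + a.isInstance(instanceOfB));
        a.getField("counter").setInt(null, 42); // touches a's copy of the static only
        System.out.println("b's counter: " + b.getField("counter").getInt(null));
    }
}
```

Each loader ends up with its own Registry class and its own copy of the static field, which is the "one singleton per ClassLoader" effect described above.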

One other observation: Just as you have set up a ClassLoader specifically to load your classes, interpreters often create their own ClassLoader instances, into which they load the interpreting environment and the script. That way, if the script changes, the ClassLoader can be dropped (and with it the script), and the script reloaded in a new ClassLoader. EJBs and Servlets also use this trick.

Andrew Alcock answered Oct 05 '22 23:10