
Finding a Java lambda from its mangled name in a heap dump

I'm hunting a memory leak, and the heap dump shows me that a number of lambda instances are holding the offending objects. The name of the lambda is the surrounding class name with $$lambda$107 at the end. I can also see that it has a single field (if that is the right name for it), called arg$1, which references the objects filling up the heap. Unfortunately, I have quite a few lambdas in this class, and I wonder what I can do to narrow it down.

I'm assuming arg$1 is an implicit argument -- a free variable in the lambda expression that gets captured when the lambda becomes a closure. Is that correct?

I'm also guessing the 107 is no real help in isolation, but are there some flags I can set to log which lambda expression gets what number?

Any other useful tips?

ExMathGuy asked Jan 10 '17 14:01


2 Answers

The OP's conjecture is correct that arg$1 is a field of a lambda object containing a captured value. The answer from lukeg is on the right track, in getting the lambda metafactory to dump its proxy classes. (+1)
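To see the captured-value field for yourself, here is a minimal sketch that uses reflection to list the fields declared by a lambda's runtime class. The class and method names (CapturedFieldDemo, fieldNames, capturing, nonCapturing) are made up for illustration, and the field layout is an implementation detail of the Oracle JDK / OpenJDK lambda metafactory, so treat the result as an observation rather than a guarantee.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

public class CapturedFieldDemo {
    /** Names of the fields declared by the runtime class of a lambda instance. */
    static List<String> fieldNames(Object lambda) {
        List<String> names = new ArrayList<>();
        for (Field f : lambda.getClass().getDeclaredFields()) {
            names.add(f.getName());
        }
        return names;
    }

    static IntSupplier capturing(Object o) {
        return () -> o.hashCode();   // o is a free variable, so it gets captured
    }

    static IntSupplier nonCapturing() {
        return () -> 0;              // nothing to capture
    }

    public static void main(String[] args) {
        // Assumption: a HotSpot-based JDK, where captured values become fields of the proxy.
        System.out.println("capturing:     " + fieldNames(capturing(new byte[16])));
        System.out.println("non-capturing: " + fieldNames(nonCapturing()));
    }
}
```

On the JDKs discussed here I would expect the capturing lambda to report a single field and the non-capturing one to report none.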

Here's an approach that uses the javap tool to track the instance holding the reference back to the source code. Basically you find the right proxy class; disassemble it to find out which synthetic lambda method it calls; then associate that synthetic lambda method with a particular lambda expression in source code.

(Most, if not all, of this information applies to Oracle JDK and OpenJDK. It might not work for different JDK implementations. Also, this is subject to change in the future. This should work with any recent Oracle JDK 8 or OpenJDK 8, though. It will probably continue to work in JDK 9.)

First, a bit of background. When a source file containing lambdas is compiled, javac will compile the lambda bodies into synthetic methods that reside in the containing class. These methods are private and static, and their names will be something like lambda$<method>$<count> where method is the name of the method that contains the lambda, and count is a sequential counter that numbers methods from the beginning of the source file (starting from zero).
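If you want to check the naming scheme without javap, a rough sketch like the following lists the synthetic lambda methods through reflection. It assumes the class was compiled with javac (other compilers may name these methods differently); SyntheticLambdaMethods and lambdaMethodNames are hypothetical names chosen for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntSupplier;

public class SyntheticLambdaMethods {
    static IntSupplier foo(int h) {
        return () -> h;
    }

    static IntSupplier bar(Object o) {
        return () -> o.hashCode();
    }

    /** Sorted names of the private static synthetic methods whose names start with "lambda$". */
    static Set<String> lambdaMethodNames(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            int mod = m.getModifiers();
            if (m.isSynthetic() && Modifier.isPrivate(mod) && Modifier.isStatic(mod)
                    && m.getName().startsWith("lambda$")) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumption: compiled with javac, which uses the lambda$<method>$<count> scheme.
        System.out.println(lambdaMethodNames(SyntheticLambdaMethods.class));
    }
}
```

The printed names should include one method for the lambda in foo and one for the lambda in bar, each carrying the name of its enclosing method.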

When a lambda expression is first evaluated at runtime, the lambda metafactory is called. This produces a class that implements the lambda's functional interface, and an instance of that class is created. When the functional interface method is invoked on this instance, it takes the arguments to that method (if any), combines them with any captured values, and calls the synthetic method compiled by javac as described above. This instance is referred to as a "function object" or a "proxy".

By getting the lambda metafactory to dump its proxy classes, you can use javap to disassemble the bytecodes and to trace a proxy instance back to the lambda expression for which it was generated. This is probably best illustrated by an example. Consider the following code:

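// The "line N" comments below are line numbers in the original source file.
import java.util.Arrays;
import java.util.List;
import java.util.function.IntSupplier;

public class CaptureTest {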
    static List<IntSupplier> list;

    static IntSupplier foo(boolean b, Object o) {
        if (b) {
            return () -> 0;                      // line 20
        } else {
            int h = o.hashCode();
            return () -> h;                      // line 23
        }
    }

    static IntSupplier bar(boolean b, Object o) {
        if (b) {
            return () -> o.hashCode();           // line 29
        } else {
            int len = o.toString().length();
            return () -> len;                    // line 32
        }
    }

    static void run() {
        Object big = new byte[10_000_000];

        list = Arrays.asList(
            bar(false, big),
            bar(true,  big),
            foo(false, big),
            foo(true,  big));

        System.out.println("Done.");
    }

    public static void main(String[] args) throws InterruptedException {
        run();
        Thread.sleep(Long.MAX_VALUE); // stay alive so a heap dump can be taken
    }
}

This code allocates a large array and then evaluates four different lambda expressions. One of these captures a reference to the large array. (You can tell by inspection if you know what you're looking for, but sometimes this is hard.) Which lambda is doing the capturing?

The first thing to do is to compile this class and to run javap -v -p CaptureTest. The -v option shows disassembled bytecode and other information such as the line number tables. The -p option must be supplied in order to get javap to disassemble private methods. The output of this includes a lot of stuff, but the important parts are the synthetic lambda methods:

private static int lambda$bar$3(int);
  descriptor: (I)I
  flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
  Code:
    stack=1, locals=1, args_size=1
       0: iload_0
       1: ireturn
    LineNumberTable:
      line 32: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       2     0   len   I

private static int lambda$bar$2(java.lang.Object);
  descriptor: (Ljava/lang/Object;)I
  flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
  Code:
    stack=1, locals=1, args_size=1
       0: aload_0
       1: invokevirtual #3                  // Method java/lang/Object.hashCode:()I
       4: ireturn
    LineNumberTable:
      line 29: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       5     0     o   Ljava/lang/Object;

private static int lambda$foo$1(int);
  descriptor: (I)I
  flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
  Code:
    stack=1, locals=1, args_size=1
       0: iload_0
       1: ireturn
    LineNumberTable:
      line 23: 0
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
          0       2     0     h   I

private static int lambda$foo$0();
  descriptor: ()I
  flags: ACC_PRIVATE, ACC_STATIC, ACC_SYNTHETIC
  Code:
    stack=1, locals=0, args_size=0
       0: iconst_0
       1: ireturn
    LineNumberTable:
      line 20: 0

The counter at the end of the method names starts at zero and is numbered sequentially from the beginning of the file. In addition, the synthetic method name includes the name of the method that contains the lambda expression, so we can tell which method was generated from each of several lambdas that occur within a single method.
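When a class has many lambdas, reading this output by eye gets tedious. As a sketch only, the following pulls the first LineNumberTable entry for each synthetic lambda method out of saved javap -v -p output. LambdaLineTable and firstLines are hypothetical names, the regular expressions are my guess at a reasonable match for the output format shown above, and the code assumes the class was compiled with line number information (javac's default).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LambdaLineTable {
    // Declaration of a synthetic lambda method, e.g.
    // "private static int lambda$bar$2(java.lang.Object);"
    private static final Pattern METHOD =
        Pattern.compile("^\\s*private\\s.*?(lambda\\$\\S+?\\$\\d+)\\(");
    // One LineNumberTable entry, e.g. "line 29: 0"
    private static final Pattern LINE = Pattern.compile("^\\s*line (\\d+): \\d+\\s*$");

    /** Maps each synthetic lambda method name to the first source line in its table. */
    static Map<String, Integer> firstLines(String javapOutput) {
        Map<String, Integer> result = new LinkedHashMap<>();
        String current = null;
        for (String row : javapOutput.split("\\R")) {
            Matcher method = METHOD.matcher(row);
            if (method.find()) {
                current = method.group(1);
                continue;
            }
            Matcher line = LINE.matcher(row);
            if (current != null && line.matches()) {
                result.put(current, Integer.parseInt(line.group(1)));
                current = null; // only the first entry after the declaration is recorded
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "private static int lambda$bar$2(java.lang.Object);",
            "  descriptor: (Ljava/lang/Object;)I",
            "    LineNumberTable:",
            "      line 29: 0");
        System.out.println(firstLines(sample)); // prints {lambda$bar$2=29}
    }
}
```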

Then, run the program under a memory profiler, supplying the command-line argument -Djdk.internal.lambda.dumpProxyClasses=<outputdir> to the java command. This causes the lambda metafactory to dump its generated classes to the named directory (which must already exist).
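Since the output directory must already exist, it can be convenient to create it and assemble the command line in one place. This is just a sketch: DumpProxyLauncher and its command method are hypothetical helpers, the property name is the one given above (it may differ on other JDK versions), and the class path and main class are placeholders for your own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class DumpProxyLauncher {
    /** Creates the dump directory if needed and returns a java command line that uses it. */
    static List<String> command(Path dumpDir, String classPath, String mainClass) {
        try {
            Files.createDirectories(dumpDir); // the dump directory must exist before the JVM starts
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        return Arrays.asList(
            javaBin,
            "-Djdk.internal.lambda.dumpProxyClasses=" + dumpDir.toAbsolutePath(),
            "-cp", classPath,
            mainClass);
    }

    public static void main(String[] args) {
        // "lambda-dump", "." and "CaptureTest" are placeholders for this illustration.
        List<String> cmd = command(Paths.get("lambda-dump"), ".", "CaptureTest");
        System.out.println(String.join(" ", cmd));
        // To actually launch the child JVM, something like:
        // new ProcessBuilder(cmd).inheritIO().start();
    }
}
```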

Get a memory profile of the application and inspect it. There are a variety of ways to do this; I used the NetBeans memory profiler. When I ran it, it told me that a byte[] with 10,000,000 elements was held by a field arg$1 in a class named CaptureTest$$Lambda$9. This is as far as the OP got.

The counter on this class name isn't useful, as it represents a sequence number of classes generated by the lambda metafactory, in the order they were generated at runtime. Knowing the runtime sequence doesn't tell us very much about where it originated in the source code.
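A quick way to convince yourself of this is to print the runtime class names of two lambdas after evaluating them in the opposite of their source order. This is a sketch with made-up names (ProxyClassNames, first, second); the exact name format is an implementation detail and varies between JDK versions, so I am not showing expected output.

```java
import java.util.function.IntSupplier;

public class ProxyClassNames {
    static IntSupplier first()  { return () -> 1; }   // earlier in the source file
    static IntSupplier second() { return () -> 2; }   // later in the source file

    public static void main(String[] args) {
        // Evaluate the lambdas in reverse source order. The generated class names
        // reflect what the metafactory did at runtime, not where the lambdas
        // appear in the source.
        System.out.println("second: " + second().getClass().getName());
        System.out.println("first:  " + first().getClass().getName());
    }
}
```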

However, we've asked the lambda metafactory to dump its classes, so we can go look at this particular class to see what it does. Indeed, in the output directory, there is a file CaptureTest$$Lambda$9.class. Running javap -c on it reveals the following:

final class CaptureTest$$Lambda$9 implements java.util.function.IntSupplier {
  public int getAsInt();
    Code:
       0: aload_0
       1: getfield      #15                 // Field arg$1:Ljava/lang/Object;
       4: invokestatic  #28                 // Method CaptureTest.lambda$bar$2:(Ljava/lang/Object;)I
       7: ireturn
}

You can decompile the constant pool entries, but javap helpfully puts symbolic names in comments to the right of the bytecodes. You can see that this loads the arg$1 field -- the offending reference -- and passes it to the method CaptureTest.lambda$bar$2. This is lambda number 2 (starting from zero) in our source file, and it's the first of two lambda expressions within the bar() method. Now you can go back to the javap output of the original class and use the line number information from the lambda static method to find the location in the source file. The line number information of the CaptureTest.lambda$bar$2 method points to line 29. The lambda at this location is

    () -> o.hashCode()

where o is a free variable, which is a capture of one of the arguments to the bar() method.
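If there are many dumped proxy classes to go through, the "which synthetic method does this proxy call" step can be scripted. The following is only a sketch that scans saved javap -c output for the first invoke instruction targeting a lambda$ method. ProxyTarget and lambdaTarget are hypothetical names, and the regular expression is an assumption based on the javap output format shown above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProxyTarget {
    // Matches e.g. "invokestatic  #28   // Method CaptureTest.lambda$bar$2:(Ljava/lang/Object;)I"
    // and captures "CaptureTest.lambda$bar$2".
    private static final Pattern TARGET = Pattern.compile(
        "invoke\\w+\\s+#\\d+[^/]*//\\s*\\w*Method\\s+(\\S+\\.lambda\\$\\S+?\\$\\d+):");

    /** Returns the first synthetic lambda method invoked in the given javap -c output, if any. */
    static Optional<String> lambdaTarget(String javapOutput) {
        Matcher m = TARGET.matcher(javapOutput);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "       0: aload_0",
            "       1: getfield      #15                 // Field arg$1:Ljava/lang/Object;",
            "       4: invokestatic  #28                 // Method CaptureTest.lambda$bar$2:(Ljava/lang/Object;)I",
            "       7: ireturn");
        System.out.println(lambdaTarget(sample).orElse("(no lambda call found)"));
    }
}
```

The result can then be looked up in the javap -v -p output of the containing class to get the source line.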

Stuart Marks answered Sep 28 '22 02:09


This is a little convoluted, but you may try:

  • starting your JVM with -Djdk.internal.lambda.dumpProxyClasses=/path/to/directory/. This option makes the JVM dump the generated proxy classes (as class files) to a directory of your choice

  • you can then try decompiling the generated classes. I wrote some sample Java code that used lambdas, opened one of the generated class files (named Test$$Lambda$3.class) in IntelliJ IDEA, and it was decompiled to:

    import java.util.function.IntPredicate;
    
    // $FF: synthetic class
    final class Test$$Lambda$3 implements IntPredicate {
        private Test$$Lambda$3() { 
        }
    
        public boolean test(int var1) {
            return Test.lambda$bar$1(var1);
        }
    }
    
  • from there you can infer the type of the lambda (IntPredicate in the example), the name of the class it was defined in (Test), and the name of the method it was defined in (bar).

lukeg answered Sep 28 '22 02:09