I have a 32-bit Java service with scalability problems: with high user count we run out of memory because of excessive thread count. In the long term, I plan to switch to 64-bit and to reduce the threads-per-user ratio. In the short term, I'd like to reduce the stack size (-Xss, -XX:ThreadStackSize) to get some more headroom. But this is risky because if I make it too small, I'm going to get StackOverflowErrors. How can I measure the average and maximum stack size for my application to guide my decision for an optimal -Xss value? I'm interested in two possible approaches: <ol> <li>Measuring a running JVM during integration testing. What profiling tools will report max stack depth?</li> <li>Static analysis of the application looking for deep call hierarchies. Reflection in dependency injection makes it unlikely that this would work.</li> </ol> Update: I know the long-term right way to fix this problem. Please focus on the question I've asked: how do I measure stack depth? Update 2: I got a nice answer on a related question specifically about JProfiler: Can JProfiler measure stack depth? (I posted the separate question as per JProfiler's community support recommendations)

You can get an idea of the stack depth with something like an aspect that can be woven to your code (load time weaver to allow advising all loaded code except system class loader). The aspect would work around all executed code and would be able to note when you are calling a method and when you return. You can use this to capture most of your stack usage (you'll miss anything loaded from the system class loader, e.g. java.*). While not perfect, it avoids having to change your code to gather StackTraceElement[] at sample points and also gets you into non-jdk code that you might not have written. For example (aspectj): <pre class="prettyprint"><code>public aspect CallStackAdvice { pointcut allMethods() : execution(* *(..)) && !within(CallStackLog); Object around(): allMethods(){ String called = thisJoinPoint.getSignature ().toLongString (); CallStackLog.calling ( called ); try { return proceed(); } finally { CallStackLog.exiting ( called ); } } } public class CallStackLog { private CallStackLog () {} private static ThreadLocal<ArrayDeque<String>> curStack = new ThreadLocal<ArrayDeque<String>> () { @Override protected ArrayDeque<String> initialValue () { return new ArrayDeque<String> (); } }; private static ThreadLocal<Boolean> ascending = new ThreadLocal<Boolean> () { @Override protected Boolean initialValue () { return true; } }; private static ConcurrentHashMap<Integer, ArrayDeque<String>> stacks = new ConcurrentHashMap<Integer, ArrayDeque<String>> (); public static void calling ( String signature ) { ascending.set ( true ); curStack.get ().push ( signature.intern () ); } public static void exiting ( String signature ) { ArrayDeque<String> cur = curStack.get (); if ( ascending.get () ) { ArrayDeque<String> clon = cur.clone (); stacks.put ( hash ( clon ), clon ); } cur.pop (); ascending.set ( false ); } public static Integer hash ( ArrayDeque<String> a ) { //simplistic and wrong but ok for example int h = 0; for ( String s : a ) { h += ( 31 * s.hashCode () ); } return h; } public static void dumpStacks(){ //implement something to print or retrieve or use stacks } } </code></pre> And a sample stack might be like: <pre class="prettyprint"><code>net.sourceforge.jtds.jdbc.TdsCore net.sourceforge.jtds.jdbc.JtdsStatement.getTds() public boolean net.sourceforge.jtds.jdbc.JtdsResultSet.next() public void net.sourceforge.jtds.jdbc.JtdsResultSet.close() public java.sql.Connection net.sourceforge.jtds.jdbc.Driver.connect(java.lang.String, java.util.Properties) public void phil.RandomStackGen.MyRunnable.run() </code></pre> Very slow and has its own memory issues but can be workable to get you the stack information you need. You can then use the max_stack and max_locals for each method in your stack traces to compute a frame size (see class file format) for the method. Based on the vm spec I believe this should be (max_stack+max_locals)*4bytes for the max frame size for a method (long/double occupy two entries on the operand stack/local vars and is accounted for in max_stack and max_locals). You can easily javap the classes of interest and see the frame values if you don't have that much in your call stacks. And something like asm provides you with some easy tools to use to do this on a larger scale. Once you have this computed, you need to estimate additional stack frames for JDK classes that might be called by you at your max stack points and add that to your stack sizes. It wont be perfect but it ought to get you a decent starting point for -Xss tuning without hacking around the JVM/JDK. One other note: I don't know what JIT/OSR does to frame sizes or stack requirements so do be aware that you may have different impacts from -Xss tuning on a cold vs. warm JVM. EDIT had a few hours of down time and threw together another approach. This is a java agent that will instrument methods to keep track of a max stack frame size and stack depth. This will be able to instrument most of the jdk classes along with your other code and libraries, giving you better results than the aspect weaver. You need asm v4 for this to work. It was more for the fun of it so file this under plinking java for fun, not profit. First, make something to track the stack frame size and depth: <pre class="prettyprint"><code>package phil.agent; public class MaxStackLog { private static ThreadLocal<Integer> curStackSize = new ThreadLocal<Integer> () { @Override protected Integer initialValue () { return 0; } }; private static ThreadLocal<Integer> curStackDepth = new ThreadLocal<Integer> () { @Override protected Integer initialValue () { return 0; } }; private static ThreadLocal<Boolean> ascending = new ThreadLocal<Boolean> () { @Override protected Boolean initialValue () { return true; } }; private static ConcurrentHashMap<Long, Integer> maxSizes = new ConcurrentHashMap<Long, Integer> (); private static ConcurrentHashMap<Long, Integer> maxDepth = new ConcurrentHashMap<Long, Integer> (); private MaxStackLog () { } public static void enter ( int frameSize ) { ascending.set ( true ); curStackSize.set ( curStackSize.get () + frameSize ); curStackDepth.set ( curStackDepth.get () + 1 ); } public static void exit ( int frameSize ) { int cur = curStackSize.get (); int curDepth = curStackDepth.get (); if ( ascending.get () ) { long id = Thread.currentThread ().getId (); Integer max = maxSizes.get ( id ); if ( max == null || cur > max ) { maxSizes.put ( id, cur ); } max = maxDepth.get ( id ); if ( max == null || curDepth > max ) { maxDepth.put ( id, curDepth ); } } ascending.set ( false ); curStackSize.set ( cur - frameSize ); curStackDepth.set ( curDepth - 1 ); } public static void dumpMax () { int max = 0; for ( int i : maxSizes.values () ) { max = Math.max ( i, max ); } System.out.println ( "Max stack frame size accummulated: " + max ); max = 0; for ( int i : maxDepth.values () ) { max = Math.max ( i, max ); } System.out.println ( "Max stack depth: " + max ); } } </code></pre> Next, make the java agent: <pre class="prettyprint"><code>package phil.agent; public class Agent { public static void premain ( String agentArguments, Instrumentation ins ) { try { ins.appendToBootstrapClassLoaderSearch ( new JarFile ( new File ( "path/to/Agent.jar" ) ) ); } catch ( IOException e ) { e.printStackTrace (); } ins.addTransformer ( new Transformer (), true ); Class<?>[] classes = ins.getAllLoadedClasses (); int len = classes.length; for ( int i = 0; i < len; i++ ) { Class<?> clazz = classes[i]; String name = clazz != null ? clazz.getCanonicalName () : null; try { if ( name != null && !clazz.isArray () && !clazz.isPrimitive () && !clazz.isInterface () && !name.equals ( "java.lang.Long" ) && !name.equals ( "java.lang.Boolean" ) && !name.equals ( "java.lang.Integer" ) && !name.equals ( "java.lang.Double" ) && !name.equals ( "java.lang.Float" ) && !name.equals ( "java.lang.Number" ) && !name.equals ( "java.lang.Class" ) && !name.equals ( "java.lang.Byte" ) && !name.equals ( "java.lang.Void" ) && !name.equals ( "java.lang.Short" ) && !name.equals ( "java.lang.System" ) && !name.equals ( "java.lang.Runtime" ) && !name.equals ( "java.lang.Compiler" ) && !name.equals ( "java.lang.StackTraceElement" ) && !name.startsWith ( "java.lang.ThreadLocal" ) && !name.startsWith ( "sun." ) && !name.startsWith ( "java.security." ) && !name.startsWith ( "java.lang.ref." ) && !name.startsWith ( "java.lang.ClassLoader" ) && !name.startsWith ( "java.util.concurrent.atomic" ) && !name.startsWith ( "java.util.concurrent.ConcurrentHashMap" ) && !name.startsWith ( "java.util.concurrent.locks." ) && !name.startsWith ( "phil.agent." ) ) { ins.retransformClasses ( clazz ); } } catch ( Throwable e ) { System.err.println ( "Cant modify: " + name ); } } Runtime.getRuntime ().addShutdownHook ( new Thread () { @Override public void run () { MaxStackLog.dumpMax (); } } ); } } </code></pre> The agent class has the premain hook for instrumentation. In that hook, it adds a class transformer that instruments in the stack frame size tracking. It also adds the agent to the boot class loader so that it can process jdk classes, too. To do that, we need to retransform anything that might be loaded already, like String.class. But, we have to exclude a variety of things that are used by the agent or the stack logging which lead to infinite loops or other problems (some of that was found by trial and error). Finally, the agent adds a shutdown hook to dump the results to stdout. <pre class="prettyprint"><code>public class Transformer implements ClassFileTransformer { @Override public byte[] transform ( ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer ) throws IllegalClassFormatException { if ( className.startsWith ( "phil/agent" ) ) { return classfileBuffer; } byte[] result = classfileBuffer; ClassReader reader = new ClassReader ( classfileBuffer ); MaxStackClassVisitor maxCv = new MaxStackClassVisitor ( null ); reader.accept ( maxCv, ClassReader.SKIP_DEBUG ); ClassWriter writer = new ClassWriter ( ClassWriter.COMPUTE_FRAMES ); ClassVisitor visitor = new CallStackClassVisitor ( writer, maxCv.frameMap, className ); reader.accept ( visitor, ClassReader.SKIP_DEBUG ); result = writer.toByteArray (); return result; } } </code></pre> The transformer drives two separate transformations - one to figure out the max stack frame size for each method and one to instrument the method for recording. It might be doable in a single pass but I didn't want to use the ASM tree API or spend more time figuring it out. <pre class="prettyprint"><code>public class MaxStackClassVisitor extends ClassVisitor { Map<String, Integer> frameMap = new HashMap<String, Integer> (); public MaxStackClassVisitor ( ClassVisitor v ) { super ( Opcodes.ASM4, v ); } @Override public MethodVisitor visitMethod ( int access, String name, String desc, String signature, String[] exceptions ) { return new MaxStackMethodVisitor ( super.visitMethod ( access, name, desc, signature, exceptions ), this, ( access + name + desc + signature ) ); } } public class MaxStackMethodVisitor extends MethodVisitor { final MaxStackClassVisitor cv; final String name; public MaxStackMethodVisitor ( MethodVisitor mv, MaxStackClassVisitor cv, String name ) { super ( Opcodes.ASM4, mv ); this.cv = cv; this.name = name; } @Override public void visitMaxs ( int maxStack, int maxLocals ) { cv.frameMap.put ( name, ( maxStack + maxLocals ) * 4 ); super.visitMaxs ( maxStack, maxLocals ); } } </code></pre> The MaxStack*Visitor classes handle figuring out the max stack frame size. <pre class="prettyprint"><code>public class CallStackClassVisitor extends ClassVisitor { final Map<String, Integer> frameSizes; final String className; public CallStackClassVisitor ( ClassVisitor v, Map<String, Integer> frameSizes, String className ) { super ( Opcodes.ASM4, v ); this.frameSizes = frameSizes; this.className = className; } @Override public MethodVisitor visitMethod ( int access, String name, String desc, String signature, String[] exceptions ) { MethodVisitor m = super.visitMethod ( access, name, desc, signature, exceptions ); return new CallStackMethodVisitor ( m, frameSizes.get ( access + name + desc + signature ) ); } } public class CallStackMethodVisitor extends MethodVisitor { final int size; public CallStackMethodVisitor ( MethodVisitor mv, int size ) { super ( Opcodes.ASM4, mv ); this.size = size; } @Override public void visitCode () { visitIntInsn ( Opcodes.SIPUSH, size ); visitMethodInsn ( Opcodes.INVOKESTATIC, "phil/agent/MaxStackLog", "enter", "(I)V" ); super.visitCode (); } @Override public void visitInsn ( int inst ) { switch ( inst ) { case Opcodes.ARETURN: case Opcodes.DRETURN: case Opcodes.FRETURN: case Opcodes.IRETURN: case Opcodes.LRETURN: case Opcodes.RETURN: case Opcodes.ATHROW: visitIntInsn ( Opcodes.SIPUSH, size ); visitMethodInsn ( Opcodes.INVOKESTATIC, "phil/agent/MaxStackLog", "exit", "(I)V" ); break; default: break; } super.visitInsn ( inst ); } } </code></pre> The CallStack*Visitor classes handle instrumenting methods with code to call the stack frame logging. And then you need a MANIFEST.MF for the Agent.jar: <pre class="prettyprint"><code>Manifest-Version: 1.0 Premain-Class: phil.agent.Agent Boot-Class-Path: asm-all-4.0.jar Can-Retransform-Classes: true </code></pre> Finally, add the following to your java command line for the program you want to instrument: <pre class="prettyprint"><code>-javaagent:path/to/Agent.jar </code></pre> You will also need to have the asm-all-4.0.jar in the same directory as the Agent.jar (or change Boot-Class-Path in the manifest to reference the location). A sample output might be: <pre class="prettyprint"><code>Max stack frame size accummulated: 44140 Max stack depth: 1004 </code></pre> This is all a bit crude but works for me to get going. Note: the stack frame size isn't a total stack size (still don't really know how to get that one). In practice, there are a variety of overheads for the thread stack. I found that I usually needed between 2 and 3 times the reported stack max frame size as a -Xss value. Oh, and be sure to do the -Xss tuning without the agent loaded as it adds to your stack size requirements.

How can I measure thread stack depth?

Tags:

java

multithreading

profiling

I have a 32-bit Java service with scalability problems: with high user count we run out of memory because of excessive thread count. In the long term, I plan to switch to 64-bit and to reduce the threads-per-user ratio. In the short term, I'd like to reduce the stack size (-Xss, -XX:ThreadStackSize) to get some more headroom. But this is risky because if I make it too small, I'm going to get StackOverflowErrors.

How can I measure the average and maximum stack size for my application to guide my decision for an optimal -Xss value? I'm interested in two possible approaches:

Measuring a running JVM during integration testing. What profiling tools will report max stack depth?
Static analysis of the application looking for deep call hierarchies. Reflection in dependency injection makes it unlikely that this would work.

Update: I know the long-term right way to fix this problem. Please focus on the question I've asked: how do I measure stack depth?

Update 2: I got a nice answer on a related question specifically about JProfiler: Can JProfiler measure stack depth? (I posted the separate question as per JProfiler's community support recommendations)

606

asked Nov 30 '11 15:11

Chris Dolan

1 Answers

You can get an idea of the stack depth with something like an aspect that can be woven to your code (load time weaver to allow advising all loaded code except system class loader). The aspect would work around all executed code and would be able to note when you are calling a method and when you return. You can use this to capture most of your stack usage (you'll miss anything loaded from the system class loader, e.g. java.*). While not perfect, it avoids having to change your code to gather StackTraceElement[] at sample points and also gets you into non-jdk code that you might not have written.

For example (aspectj):

public aspect CallStackAdvice {

   pointcut allMethods() : execution(* *(..)) && !within(CallStackLog);

   Object around(): allMethods(){
       String called = thisJoinPoint.getSignature ().toLongString ();
       CallStackLog.calling ( called );
       try {
           return proceed();
       } finally {
           CallStackLog.exiting ( called );
       }
   }

}

public class CallStackLog {

    private CallStackLog () {}

    private static ThreadLocal<ArrayDeque<String>> curStack = 
        new ThreadLocal<ArrayDeque<String>> () {
        @Override
        protected ArrayDeque<String> initialValue () {
            return new ArrayDeque<String> ();
        }
    };

    private static ThreadLocal<Boolean> ascending = 
        new ThreadLocal<Boolean> () {
        @Override
        protected Boolean initialValue () {
            return true;
        }
    };

    private static ConcurrentHashMap<Integer, ArrayDeque<String>> stacks = 
         new ConcurrentHashMap<Integer, ArrayDeque<String>> ();

    public static void calling ( String signature ) {
        ascending.set ( true );
        curStack.get ().push ( signature.intern () );
    }

    public static void exiting ( String signature ) {
        ArrayDeque<String> cur = curStack.get ();
        if ( ascending.get () ) {
            ArrayDeque<String> clon = cur.clone ();
            stacks.put ( hash ( clon ), clon );
        }
        cur.pop ();
        ascending.set ( false );
    }

    public static Integer hash ( ArrayDeque<String> a ) {
        //simplistic and wrong but ok for example
        int h = 0;
        for ( String s : a ) {
            h += ( 31 * s.hashCode () );
        }
        return h;
    }

    public static void dumpStacks(){
        //implement something to print or retrieve or use stacks
    }
}

And a sample stack might be like:

net.sourceforge.jtds.jdbc.TdsCore net.sourceforge.jtds.jdbc.JtdsStatement.getTds()
public boolean net.sourceforge.jtds.jdbc.JtdsResultSet.next()
public void net.sourceforge.jtds.jdbc.JtdsResultSet.close()
public java.sql.Connection net.sourceforge.jtds.jdbc.Driver.connect(java.lang.String, java.util.Properties)
public void phil.RandomStackGen.MyRunnable.run()

Very slow and has its own memory issues but can be workable to get you the stack information you need.

You can then use the max_stack and max_locals for each method in your stack traces to compute a frame size (see class file format) for the method. Based on the vm spec I believe this should be (max_stack+max_locals)*4bytes for the max frame size for a method (long/double occupy two entries on the operand stack/local vars and is accounted for in max_stack and max_locals).

You can easily javap the classes of interest and see the frame values if you don't have that much in your call stacks. And something like asm provides you with some easy tools to use to do this on a larger scale.

Once you have this computed, you need to estimate additional stack frames for JDK classes that might be called by you at your max stack points and add that to your stack sizes. It wont be perfect but it ought to get you a decent starting point for -Xss tuning without hacking around the JVM/JDK.

One other note: I don't know what JIT/OSR does to frame sizes or stack requirements so do be aware that you may have different impacts from -Xss tuning on a cold vs. warm JVM.

EDIT had a few hours of down time and threw together another approach. This is a java agent that will instrument methods to keep track of a max stack frame size and stack depth. This will be able to instrument most of the jdk classes along with your other code and libraries, giving you better results than the aspect weaver. You need asm v4 for this to work. It was more for the fun of it so file this under plinking java for fun, not profit.

First, make something to track the stack frame size and depth:

package phil.agent;

public class MaxStackLog {

    private static ThreadLocal<Integer> curStackSize = 
        new ThreadLocal<Integer> () {
        @Override
        protected Integer initialValue () {
            return 0;
        }
    };

    private static ThreadLocal<Integer> curStackDepth = 
        new ThreadLocal<Integer> () {
        @Override
        protected Integer initialValue () {
            return 0;
        }
    };

    private static ThreadLocal<Boolean> ascending = 
        new ThreadLocal<Boolean> () {
        @Override
        protected Boolean initialValue () {
            return true;
        }
    };

    private static ConcurrentHashMap<Long, Integer> maxSizes = 
        new ConcurrentHashMap<Long, Integer> ();
    private static ConcurrentHashMap<Long, Integer> maxDepth = 
        new ConcurrentHashMap<Long, Integer> ();

    private MaxStackLog () { }

    public static void enter ( int frameSize ) {
        ascending.set ( true );
        curStackSize.set ( curStackSize.get () + frameSize );
        curStackDepth.set ( curStackDepth.get () + 1 );
    }

    public static void exit ( int frameSize ) {
        int cur = curStackSize.get ();
        int curDepth = curStackDepth.get ();
        if ( ascending.get () ) {
            long id = Thread.currentThread ().getId ();
            Integer max = maxSizes.get ( id );
            if ( max == null || cur > max ) {
                maxSizes.put ( id, cur );
            }
            max = maxDepth.get ( id );
            if ( max == null || curDepth > max ) {
                maxDepth.put ( id, curDepth );
            }
        }
        ascending.set ( false );
        curStackSize.set ( cur - frameSize );
        curStackDepth.set ( curDepth - 1 );
    }

    public static void dumpMax () {
        int max = 0;
        for ( int i : maxSizes.values () ) {
            max = Math.max ( i, max );
        }
        System.out.println ( "Max stack frame size accummulated: " + max );
        max = 0;
        for ( int i : maxDepth.values () ) {
            max = Math.max ( i, max );
        }
        System.out.println ( "Max stack depth: " + max );
    }
}

Next, make the java agent:

package phil.agent;

public class Agent {

    public static void premain ( String agentArguments, Instrumentation ins ) {
        try {
            ins.appendToBootstrapClassLoaderSearch ( 
                new JarFile ( 
                    new File ( "path/to/Agent.jar" ) ) );
        } catch ( IOException e ) {
            e.printStackTrace ();
        }
        ins.addTransformer ( new Transformer (), true );
        Class<?>[] classes = ins.getAllLoadedClasses ();
        int len = classes.length;
        for ( int i = 0; i < len; i++ ) {
            Class<?> clazz = classes[i];
            String name = clazz != null ? clazz.getCanonicalName () : null;
            try {
                if ( name != null && !clazz.isArray () && !clazz.isPrimitive ()
                        && !clazz.isInterface () 
                        && !name.equals ( "java.lang.Long" )
                        && !name.equals ( "java.lang.Boolean" )
                        && !name.equals ( "java.lang.Integer" )
                        && !name.equals ( "java.lang.Double" ) 
                        && !name.equals ( "java.lang.Float" )
                        && !name.equals ( "java.lang.Number" ) 
                        && !name.equals ( "java.lang.Class" )
                        && !name.equals ( "java.lang.Byte" ) 
                        && !name.equals ( "java.lang.Void" )
                        && !name.equals ( "java.lang.Short" ) 
                        && !name.equals ( "java.lang.System" )
                        && !name.equals ( "java.lang.Runtime" )
                        && !name.equals ( "java.lang.Compiler" )
                        && !name.equals ( "java.lang.StackTraceElement" )
                        && !name.startsWith ( "java.lang.ThreadLocal" )
                        && !name.startsWith ( "sun." ) 
                        && !name.startsWith ( "java.security." )
                        && !name.startsWith ( "java.lang.ref." )
                        && !name.startsWith ( "java.lang.ClassLoader" )
                        && !name.startsWith ( "java.util.concurrent.atomic" )
                        && !name.startsWith ( "java.util.concurrent.ConcurrentHashMap" )
                        && !name.startsWith ( "java.util.concurrent.locks." )
                        && !name.startsWith ( "phil.agent." ) ) {
                    ins.retransformClasses ( clazz );
                }
            } catch ( Throwable e ) {
                System.err.println ( "Cant modify: " + name );
            }
        }

        Runtime.getRuntime ().addShutdownHook ( new Thread () {
            @Override
            public void run () {
                MaxStackLog.dumpMax ();
            }
        } );
    }
}

The agent class has the premain hook for instrumentation. In that hook, it adds a class transformer that instruments in the stack frame size tracking. It also adds the agent to the boot class loader so that it can process jdk classes, too. To do that, we need to retransform anything that might be loaded already, like String.class. But, we have to exclude a variety of things that are used by the agent or the stack logging which lead to infinite loops or other problems (some of that was found by trial and error). Finally, the agent adds a shutdown hook to dump the results to stdout.

public class Transformer implements ClassFileTransformer {

    @Override
    public byte[] transform ( ClassLoader loader, 
        String className, Class<?> classBeingRedefined,
            ProtectionDomain protectionDomain, byte[] classfileBuffer )
            throws IllegalClassFormatException {

        if ( className.startsWith ( "phil/agent" ) ) {
            return classfileBuffer;
        }

        byte[] result = classfileBuffer;
        ClassReader reader = new ClassReader ( classfileBuffer );
        MaxStackClassVisitor maxCv = new MaxStackClassVisitor ( null );
        reader.accept ( maxCv, ClassReader.SKIP_DEBUG );

        ClassWriter writer = new ClassWriter ( ClassWriter.COMPUTE_FRAMES );
        ClassVisitor visitor = 
            new CallStackClassVisitor ( writer, maxCv.frameMap, className );
        reader.accept ( visitor, ClassReader.SKIP_DEBUG );
        result = writer.toByteArray ();
        return result;
    }
}

The transformer drives two separate transformations - one to figure out the max stack frame size for each method and one to instrument the method for recording. It might be doable in a single pass but I didn't want to use the ASM tree API or spend more time figuring it out.

public class MaxStackClassVisitor extends ClassVisitor {

    Map<String, Integer> frameMap = new HashMap<String, Integer> ();

    public MaxStackClassVisitor ( ClassVisitor v ) {
        super ( Opcodes.ASM4, v );
    }

    @Override
    public MethodVisitor visitMethod ( int access, String name, 
        String desc, String signature,
            String[] exceptions ) {
        return new MaxStackMethodVisitor ( 
            super.visitMethod ( access, name, desc, signature, exceptions ), 
            this, ( access + name + desc + signature ) );
    }
}

public class MaxStackMethodVisitor extends MethodVisitor {

    final MaxStackClassVisitor cv;
    final String name;

    public MaxStackMethodVisitor ( MethodVisitor mv, 
        MaxStackClassVisitor cv, String name ) {
        super ( Opcodes.ASM4, mv );
        this.cv = cv;
        this.name = name;
    }

    @Override
    public void visitMaxs ( int maxStack, int maxLocals ) {
        cv.frameMap.put ( name, ( maxStack + maxLocals ) * 4 );
        super.visitMaxs ( maxStack, maxLocals );
    }
}

The MaxStack*Visitor classes handle figuring out the max stack frame size.

public class CallStackClassVisitor extends ClassVisitor {

    final Map<String, Integer> frameSizes;
    final String className;

    public CallStackClassVisitor ( ClassVisitor v, 
        Map<String, Integer> frameSizes, String className ) {
        super ( Opcodes.ASM4, v );
        this.frameSizes = frameSizes;
        this.className = className;
    }

    @Override
    public MethodVisitor visitMethod ( int access, String name, 
        String desc, String signature, String[] exceptions ) {
        MethodVisitor m = super.visitMethod ( access, name, desc, 
                             signature, exceptions );
        return new CallStackMethodVisitor ( m, 
                 frameSizes.get ( access + name + desc + signature ) );
    }
}

public class CallStackMethodVisitor extends MethodVisitor {

    final int size;

    public CallStackMethodVisitor ( MethodVisitor mv, int size ) {
        super ( Opcodes.ASM4, mv );
        this.size = size;
    }

    @Override
    public void visitCode () {
        visitIntInsn ( Opcodes.SIPUSH, size );
        visitMethodInsn ( Opcodes.INVOKESTATIC, "phil/agent/MaxStackLog",
                "enter", "(I)V" );
        super.visitCode ();
    }

    @Override
    public void visitInsn ( int inst ) {
        switch ( inst ) {
            case Opcodes.ARETURN:
            case Opcodes.DRETURN:
            case Opcodes.FRETURN:
            case Opcodes.IRETURN:
            case Opcodes.LRETURN:
            case Opcodes.RETURN:
            case Opcodes.ATHROW:
                visitIntInsn ( Opcodes.SIPUSH, size );
                visitMethodInsn ( Opcodes.INVOKESTATIC,
                        "phil/agent/MaxStackLog", "exit", "(I)V" );
                break;
            default:
                break;
        }

        super.visitInsn ( inst );
    }
}

The CallStack*Visitor classes handle instrumenting methods with code to call the stack frame logging.

And then you need a MANIFEST.MF for the Agent.jar:

Manifest-Version: 1.0
Premain-Class: phil.agent.Agent
Boot-Class-Path: asm-all-4.0.jar
Can-Retransform-Classes: true

Finally, add the following to your java command line for the program you want to instrument:

-javaagent:path/to/Agent.jar

You will also need to have the asm-all-4.0.jar in the same directory as the Agent.jar (or change Boot-Class-Path in the manifest to reference the location).

A sample output might be:

Max stack frame size accummulated: 44140
Max stack depth: 1004

This is all a bit crude but works for me to get going.

Note: the stack frame size isn't a total stack size (still don't really know how to get that one). In practice, there are a variety of overheads for the thread stack. I found that I usually needed between 2 and 3 times the reported stack max frame size as a -Xss value. Oh, and be sure to do the -Xss tuning without the agent loaded as it adds to your stack size requirements.

184

answered Sep 28 '22 19:09

philwb

Related questions
                            
                                Improving performance of fuzzy string matching against a dictionary
                            
                                How to change log4j setting during application run?
                            
                                Implementing my own remote desktop in java
                            
                                Maven: keeping dependent jars in project version control
                            
                                A good working example of Selenium2 and webdriver
                            
                                Porting AWT graphics code to Android
                            
                                need flexible datetime conversion with joda
                            
                                What is the best resizable byte buffer available in Java?
                            
                                Mockito refuses to throw checked exception
                            
                                Eclipse-RCP Wrong encoding when deploying the product
                            
                                Java-Mode Argument Indenting in Emacs
                            
                                PPP Server on Windows
                            
                                Using MessagePack with Android
                            
                                Cycles in chained exceptions
                            
                                Why does 64 bit JVM throw Out Of Memory before xmx is reached?
                            
                                Using .getclass() on an interface
                            
                                IntelliJ Idea under Linux, No such file or directory on main class
                            
                                Should I close the FileChannel?
                            
                                Simplified Bresenham's line algorithm: What does it *exactly* do?
                            
                                How to programmatically generate .class files?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With