Detect recursive method calls at run time in byte code using ASM (5.x): howto?

The problem is as follows; the method, in Java code, is:

Rule foo()
{
    return sequence(foo(), x());
}

This will provoke a parsing loop which of course should be avoided; however, this is legal:

Rule foo()
{
    return sequence(x(), foo());
}

Now, somewhere else in the code I do have access to a RuleMethod, which is a class extending MethodNode, and therefore I do have access to the following information:

  • ruleMethod.name: foo; (defined in MethodNode)
  • ruleMethod.desc: ()Lorg/parboiled/Rule; (defined in MethodNode)
  • ruleMethod.ownerClass: com.github.fge.grappa.experiments.SelfReferringRule.MyParser (defined in RuleMethod)

And the bytecode of the first code extract above is as follows:

Method 'foo':
 0    L0
 1     ALOAD 0
 2     ALOAD 0
 3     INVOKEVIRTUAL com/github/fge/grappa/experiments/SelfReferringRule$MyParser.foo ()Lorg/parboiled/Rule;
 4     ALOAD 0
 5     INVOKEVIRTUAL com/github/fge/grappa/experiments/SelfReferringRule$MyParser.x ()Lorg/parboiled/Rule;
 6     ICONST_0
 7     ANEWARRAY java/lang/Object
 8     INVOKEVIRTUAL com/github/fge/grappa/experiments/SelfReferringRule$MyParser.sequence (Ljava/lang/Object;Ljava/lang/Object;[Ljava/lang/Object;)Lorg/parboiled/Rule;
 9     ARETURN
10    L1

Which means I have all the information I need to spot, at least in the bytecode above, that foo() is the first argument of the sequence() invocation, since the method accepts three arguments and three argument values are on the stack when it is invoked.

But of course I can't "eye inspect" at runtime. Therefore I need a way to do this...

It looks like what I need is a MethodVisitor and, somehow, visitInsn(), then see what arguments there are and detect appropriately...

But I don't have the slightest idea where to start; searching around on the net seems to only give examples of how to modify byte code, not detect such situations :/

Where do I start?

fge asked Oct 31 '22 10:10


1 Answer

Analysis is generally much easier with ASM's tree API, as it lets you backtrack through the instruction list easily and provides support for flow analysis.

If I understand your problem correctly, and you only need to support simple cases such as your example, all you need to do is scan backwards from the call to sequence(). Since you know the code compiles, what's on the stack must be valid, so just count back three method calls, field gets, etc.
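A minimal sketch of that backwards scan, assuming straight-line code with no jumps between the arguments and the call. To keep it runnable without ASM on the classpath, it uses a hypothetical Insn class as a stand-in for ASM's instruction nodes; the names FirstArgumentScan, Insn, producerOfArgument, firstArgumentIsCallTo and body are made up for illustration. With the real tree API you would walk MethodNode.instructions using AbstractInsnNode.getPrevious(), and derive the pop/push counts from the opcode and, for a MethodInsnNode, from its desc field (for instance via Type.getArgumentTypes).

```java
import java.util.Arrays;
import java.util.List;

final class FirstArgumentScan {

    /** Hypothetical, simplified stand-in for an ASM instruction node. */
    static final class Insn {
        final String opcode; // e.g. "ALOAD", "INVOKEVIRTUAL"
        final String target; // invoked method name, or null if not a call
        final int pops;      // values removed from the operand stack
        final int pushes;    // values added to the operand stack

        Insn(String opcode, String target, int pops, int pushes) {
            this.opcode = opcode;
            this.target = target;
            this.pops = pops;
            this.pushes = pushes;
        }
    }

    /**
     * Returns the index of the instruction that pushed argument argIndex
     * (0-based) of the call at callIndex, or -1 if none is found.
     * Assumes straight-line code between the producer and the call.
     */
    static int producerOfArgument(List<Insn> code, int callIndex,
                                  int argIndex, int argCount) {
        // Position of the wanted value counted from the top of the stack
        // just before the call (1 = top of stack).
        int depth = argCount - argIndex;
        for (int i = callIndex - 1; i >= 0; i--) {
            Insn insn = code.get(i);
            if (insn.pushes >= depth) {
                return i; // this instruction pushed the wanted value
            }
            // Undo this instruction's stack effect and keep scanning.
            depth += insn.pops - insn.pushes;
        }
        return -1;
    }

    /** True if the first argument of the call is the result of calling methodName. */
    static boolean firstArgumentIsCallTo(List<Insn> code, int callIndex,
                                         int argCount, String methodName) {
        int producer = producerOfArgument(code, callIndex, 0, argCount);
        return producer >= 0 && methodName.equals(code.get(producer).target);
    }

    /** Models "return sequence(first(), second());" as in the question's bytecode. */
    static List<Insn> body(String first, String second) {
        return Arrays.asList(
            new Insn("ALOAD", null, 0, 1),               // receiver of sequence()
            new Insn("ALOAD", null, 0, 1),               // receiver of first()
            new Insn("INVOKEVIRTUAL", first, 1, 1),
            new Insn("ALOAD", null, 0, 1),               // receiver of second()
            new Insn("INVOKEVIRTUAL", second, 1, 1),
            new Insn("ICONST_0", null, 0, 1),            // empty varargs array size
            new Insn("ANEWARRAY", null, 1, 1),
            new Insn("INVOKEVIRTUAL", "sequence", 4, 1), // receiver + 3 arguments
            new Insn("ARETURN", null, 1, 0));
    }

    public static void main(String[] args) {
        // sequence(foo(), x()) inside foo(): first argument is a self call
        System.out.println(firstArgumentIsCallTo(body("foo", "x"), 7, 3, "foo")); // true
        // sequence(x(), foo()) inside foo(): first argument is x()
        System.out.println(firstArgumentIsCallTo(body("x", "foo"), 7, 3, "foo")); // false
    }
}
```

The sketch compares method names only. A real check should probably also compare the call's owner and descriptor against ruleMethod.ownerClass and ruleMethod.desc, and account for two-slot values (long, double) and stack-shuffling instructions such as DUP.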

If you want to support more complex scenarios, where the inputs are assigned to variables in different branches, you will need some sort of flow analysis.

henry answered Nov 13 '22 15:11