ClassWriter COMPUTE_FRAMES in ASM

I've been trying to understand how stack map frames work in Java by playing around with jumps in ASM. I created a simple method to try some things out (disassembled with Krakatau):

    L0:     ldc 'hello' 
    L2:     astore_1 
    L3:     getstatic Field java/lang/System out Ljava/io/PrintStream; 
    L6:     new java/lang/StringBuilder 
    L9:     dup 
    L10:    invokespecial Method java/lang/StringBuilder <init> ()V 
    L13:    ldc 'concat1' 
    L15:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L18:    aload_1 
    L19:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L22:    invokevirtual Method java/lang/StringBuilder toString ()Ljava/lang/String; 
    L25:    invokevirtual Method java/io/PrintStream println (Ljava/lang/String;)V 
    L28:    getstatic Field java/lang/System out Ljava/io/PrintStream; 
    L31:    new java/lang/StringBuilder 
    L34:    dup 
    L35:    invokespecial Method java/lang/StringBuilder <init> ()V 
    L38:    ldc 'concat2' 
    L40:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L43:    aload_1 
    L44:    invokevirtual Method java/lang/StringBuilder append (Ljava/lang/String;)Ljava/lang/StringBuilder; 
    L47:    invokevirtual Method java/lang/StringBuilder toString ()Ljava/lang/String; 
    L50:    invokevirtual Method java/io/PrintStream println (Ljava/lang/String;)V 
    L53:    return 

All it does is create a StringBuilder to join some strings with variables.

Since the invokespecial call at L35 has exactly the same stack as the invokespecial call at L10, I decided to add an ICONST_1; IFEQ L10 sequence just before L35 with ASM.

When I disassembled the result (again with Krakatau), I found it quite strange. ASM had computed the stack frame at L10 to be:

    .stack full
        locals Object [Ljava/lang/String; Object java/lang/String 
        stack Object java/io/PrintStream Top Top 
    .end stack

instead of

    stack Object java/io/PrintStream Object java/lang/StringBuilder Object java/lang/StringBuilder

as I had expected.

Furthermore, this class would not pass verification, as one cannot call StringBuilder#<init> on Top. According to the ASM manual, Top refers to an uninitialized value, but the value doesn't appear to be uninitialized, either on the path from the jump or on the path through the code before L10. I don't understand what is wrong with the jump.

Is there something wrong with the jump I inserted that somehow makes the class impossible to compute frames for? Is this perhaps a bug with ASM's ClassWriter?

asked Mar 11 '23 09:03 by konsolas

1 Answer

Uninitialized instances are special. Consider that, once you dup the reference, you already have two references to the same instance on the stack, and you might perform even more stack manipulations, or store the reference in a local variable and from there copy it to other variables or push it again. Still, the target of the reference is supposed to be initialized exactly once before you use it in any way. To verify this, the identity of the object must be tracked, so that all these references to the same object turn from uninitialized to initialized when you perform an invokespecial <init> on it.

The Java programming language doesn’t use all the possibilities, but for legal code like

    new Foo(new Foo(new Foo(), new Foo(b? new Foo(a): new Foo(b, c))))

the verifier must not lose track of which Foo instance has been initialized and which has not when the branch is made.

So each Uninitialized Instance stack frame entry is tied to the new instruction that created it. All entries keep this reference (which can be handled as easily as remembering the bytecode offset of the new instruction) when being transferred or copied. Only after invokespecial <init> has been invoked on one of them do all entries pointing to the same new instruction turn into an ordinary instance of the declaring class, which can subsequently be merged with other type-compatible entries.
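The tracking described above can be sketched with a toy model. This is only an illustration, not how the JVM or ASM represent frames: the `UninitTrackingSketch` class, its `Entry` type, and the `invokeInit` method are hypothetical names, and the offsets 6 and 31 are the positions of the two new instructions in the question's listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these are not the JVM's or ASM's actual data structures.
final class UninitTrackingSketch {

    // A frame entry: an ordinary object type, or an uninitialized instance
    // identified by the bytecode offset of the `new` that created it.
    static final class Entry {
        final String type;
        final int newOffset; // -1 means "already initialized"

        Entry(String type, int newOffset) {
            this.type = type;
            this.newOffset = newOffset;
        }

        boolean isUninitialized() {
            return newOffset >= 0;
        }

        @Override
        public String toString() {
            return isUninitialized()
                    ? "Uninitialized " + type + "@" + newOffset
                    : "Object " + type;
        }
    }

    // Models invokespecial <init> with no arguments: pops the receiver, then
    // turns every remaining entry created by the same `new` into an object.
    static List<Entry> invokeInit(List<Entry> stack) {
        Entry receiver = stack.get(stack.size() - 1);
        if (!receiver.isUninitialized()) {
            throw new IllegalStateException("receiver is not uninitialized: " + receiver);
        }
        List<Entry> result = new ArrayList<>();
        for (Entry e : stack.subList(0, stack.size() - 1)) {
            result.add(e.newOffset == receiver.newOffset ? new Entry(e.type, -1) : e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stack just before the invokespecial at L10 in the question.
        List<Entry> before = Arrays.asList(
                new Entry("java/io/PrintStream", -1),
                new Entry("java/lang/StringBuilder", 6),
                new Entry("java/lang/StringBuilder", 6));
        System.out.println(invokeInit(before));
    }
}
```

In this model the duplicated reference left on the stack becomes an ordinary StringBuilder entry only because it carries the same offset as the receiver. An entry created by the new at offset 31 would be left untouched.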

This implies that a branch like the one you are trying to create is not possible. Two Uninitialized Instance entries of the same type, but created by different new instructions, are incompatible. And incompatible types are merged to a Top entry, which is basically an unusable entry. The code could even be correct, if you don’t attempt to use that entry at the branch target, so ASM is not doing anything wrong when merging them to Top without complaining.
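The merge at the branch target can be sketched in the same spirit. This is a simplified illustration under stated assumptions, not ASM's actual algorithm: `FrameMergeSketch`, `merge`, and `mergeFrames` are hypothetical names, entries are plain strings, and the common-superclass computation a real verifier performs is reduced to java/lang/Object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: entries are plain strings, and the notation
// "Uninitialized <offset>" is made up for this sketch.
final class FrameMergeSketch {

    // Merge one pair of entries arriving at the same branch target.
    static String merge(String a, String b) {
        if (a.equals(b)) {
            return a; // includes uninitialized entries from the same `new`
        }
        if (a.startsWith("Object ") && b.startsWith("Object ")) {
            // Simplification: a real verifier computes the common superclass.
            return "Object java/lang/Object";
        }
        return "Top"; // anything else, including different `new` offsets
    }

    static List<String> mergeFrames(List<String> a, List<String> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("stack heights differ");
        }
        List<String> merged = new ArrayList<>();
        for (int i = 0; i < a.size(); i++) {
            merged.add(merge(a.get(i), b.get(i)));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stack reaching L10 by falling through from L9.
        List<String> fallThrough = Arrays.asList(
                "Object java/io/PrintStream", "Uninitialized 6", "Uninitialized 6");
        // Stack reaching L10 through the inserted IFEQ before L35.
        List<String> viaJump = Arrays.asList(
                "Object java/io/PrintStream", "Uninitialized 31", "Uninitialized 31");
        System.out.println(mergeFrames(fallThrough, viaJump));
    }
}
```

With the two stacks from the question, the model yields PrintStream followed by two Top entries, the same shape as the frame ASM computed at L10.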

Note that this also implies that any kind of loop that could lead to a stack frame holding more than one uninitialized instance created by the same new instruction is not allowed.
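The loop restriction can be illustrated with a toy check, loosely modelled on the type-checking rule for new in the JVM specification, which, as far as I recall, requires that the uninitialized entry for that offset is not already on the operand stack. `NewInLoopSketch` and `executeNew` are hypothetical names, and the sketch ignores local variables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: entries are plain strings, locals are ignored.
final class NewInLoopSketch {

    // Toy transfer function for a `new` instruction at the given offset.
    static List<String> executeNew(List<String> stack, int offset) {
        String entry = "Uninitialized " + offset;
        if (stack.contains(entry)) {
            // Two distinct objects would share one identity; reject.
            throw new IllegalStateException("frame already holds " + entry);
        }
        List<String> result = new ArrayList<>(stack);
        result.add(entry);
        return result;
    }

    public static void main(String[] args) {
        List<String> afterFirst =
                executeNew(Arrays.asList("Object java/io/PrintStream"), 6);
        System.out.println(afterFirst);
        try {
            // Simulates looping back to offset 6 with the entry still on the stack.
            executeNew(afterFirst, 6);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Since the offset is the only identity an entry carries, a second object from the same new instruction would be indistinguishable from the first, which is why the check has to reject the loop.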

answered Mar 28 '23 12:03 by Holger
