
Use of stackmap frames and how it helps in byte code verification?

I have been trying to get my head around the obscure stack map frame and its role in making it possible to verify a dynamically loaded class in just a single pass.

A few Stack Overflow answers and other resources that I found immensely helpful:

  1. Is there a better explanation of stack map frames?

  2. What kind of Java code requires stackmap frames?

  3. http://chrononsystems.com/blog/java-7-design-flaw-leads-to-huge-backward-step-for-the-jvm

I understand the following:

  1. Every basic block should start with a stack map frame.
  2. Every instruction immediately following an unconditional branch (which is the start of a basic block) should have a stack map frame.
  3. The algorithm ASM uses to create stack map frames (section 3.5 of the ASM doc).

The shortcoming of all these resources is that they don't describe how exactly the stack map frame is used in verification.

More specifically, let's say we have bytecode like the listing below. At the current location, the operand stack is empty and the type of local variable 1 is B. The location L0 has a stack map frame associated with it. How does the verifier use this information?

    <initial frame for method>
    GETSTATIC B.VALUE
    ASTORE 1 
    GOTO L0 <- Current location
    <stack map frame>
L1  GETSTATIC A.VALUE
    ASTORE 1
    <stack map frame>
L0  ILOAD 0
    IFNE L1
    <stack map frame>
    ALOAD 1
    ARETURN

Note: I did read through the JVM spec and failed miserably to understand the stack map frame. Any help would be appreciated.

KodeWarrior asked Mar 09 '17 23:03




1 Answer

At every point in the bytecode, every item in the locals and operand stack has an implicit type. Under the old system, the verifier calculated these types as it went, but in the event that control flow goes backwards, that could change the types at the target, meaning that it had to iterate until convergence.
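To make "iterate until convergence" concrete, here is a rough sketch of inference on a toy model. It is not how the JVM is actually implemented: it tracks a single local variable (var 0), ignores the operand stack, and uses a made-up four-element type lattice. The names `InferenceSketch`, `merge`, and `infer` are my own, chosen for illustration.

```java
import java.util.Arrays;

final class InferenceSketch {
    // Least upper bound in a toy lattice: NULL < Foo < Object < TOP (TOP = unset/unusable).
    // A Java null means "this instruction has not been reached yet".
    static String merge(String a, String b) {
        if (a == null) return b;
        if (b == null) return a;
        if (a.equals(b)) return a;
        if (a.equals("TOP") || b.equals("TOP")) return "TOP";
        if (a.equals("NULL")) return b;
        if (b.equals("NULL")) return a;
        return "Object";
    }

    // store[pc]: type written to var 0 by instruction pc, or null if it leaves var 0 alone.
    // jump[pc]:  goto target of instruction pc, or -1 to fall through.
    // Fills in[pc] with the type of var 0 before instruction pc; returns the number of passes.
    static int infer(String[] store, int[] jump, String[] in) {
        in[0] = "TOP"; // var 0 is unset on method entry
        int passes = 0;
        boolean changed = true;
        while (changed) {
            changed = false;
            passes++;
            for (int pc = 0; pc < store.length; pc++) {
                if (in[pc] == null) continue; // not reached yet
                String out = store[pc] != null ? store[pc] : in[pc];
                int succ = jump[pc] >= 0 ? jump[pc] : pc + 1;
                if (succ >= store.length) continue;
                String merged = merge(in[succ], out);
                if (!merged.equals(in[succ])) {
                    in[succ] = merged; // a type changed, so another pass is needed
                    changed = true;
                }
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        // A loop: store null in var 0, then repeatedly store a Foo and jump back to index 2.
        String[] store = {null, "NULL", null, null, null, "Foo", null};
        int[] jump = {-1, -1, -1, -1, -1, -1, 2};
        String[] in = new String[store.length];
        int passes = infer(store, jump, in);
        System.out.println(Arrays.toString(in) + " after " + passes + " passes");
    }
}
```

The backward jump is what forces the extra passes: the type first inferred at the loop head gets widened once the jump is processed, and everything after the loop head has to be recomputed.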

Now, the types are explicitly specified at such jump targets. The verifier makes a single, linear pass through the bytecode. Whenever it gets to a stack map frame, it asserts that the currently inferred types are compatible with the explicit types in the frame, and then it continues, using the frame's types. Whenever it gets to a jump, it asserts that the currently inferred types are compatible with the types in the stack map frame at the target of the jump.

Essentially, the stack frames explicitly store the results of "iterating to convergence" meaning that instead of calculating them, the verifier just checks that the results are correct, which can be done in a single pass.
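The check-instead-of-compute idea can be sketched on a toy model too. Again, this is only an illustration under heavy assumptions, not the JVM's real type checker: it tracks one local variable (var 0), ignores the operand stack, and only knows the types TOP (unset), NULL, Foo, and Object. `StackMapCheckSketch`, `Insn`, `isAssignable`, and `verify` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class StackMapCheckSketch {
    // Toy assignability: anything fits TOP, NULL fits any reference type, Foo fits Object.
    static boolean isAssignable(String from, String to) {
        if (to.equals("TOP") || from.equals(to)) return true;
        if (from.equals("TOP")) return false; // an unset variable is not usable as a reference
        if (from.equals("NULL")) return true;
        return to.equals("Object");
    }

    // One toy instruction: optionally stores a type into var 0, optionally jumps.
    static final class Insn {
        final String store;   // type written to var 0, or null if var 0 is untouched
        final int jumpTarget; // index of the goto target, or -1 for fall-through
        Insn(String store, int jumpTarget) {
            this.store = store;
            this.jumpTarget = jumpTarget;
        }
    }

    // Single linear pass; frames maps an instruction index to the declared type of var 0.
    static boolean verify(Insn[] code, Map<Integer, String> frames) {
        String current = "TOP"; // var 0 is unset on method entry
        for (int pc = 0; pc < code.length; pc++) {
            String declared = frames.get(pc);
            if (declared != null) {
                // Falling into a frame: the inferred type must fit, then the declared type takes over.
                if (current != null && !isAssignable(current, declared)) return false;
                current = declared;
            } else if (current == null) {
                return false; // code right after an unconditional jump needs a frame
            }
            Insn insn = code[pc];
            if (insn.store != null) current = insn.store;
            if (insn.jumpTarget >= 0) {
                // Jumping: the inferred type must fit the frame declared at the target.
                String target = frames.get(insn.jumpTarget);
                if (target == null || !isAssignable(current, target)) return false;
                current = null; // nothing falls through a goto
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Store null in var 0, then loop: store a Foo and jump back to index 2.
        Insn[] code = {
            new Insn(null, -1), new Insn("NULL", -1), new Insn(null, -1), new Insn(null, -1),
            new Insn(null, -1), new Insn("Foo", -1), new Insn(null, 2)
        };
        Map<Integer, String> frames = new HashMap<>();
        frames.put(2, "Object"); // the loop head declares var 0 as Object
        System.out.println(verify(code, frames));
    }
}
```

Note that the loop body is never revisited: the backward jump is checked against the declared frame only, so a wrong or missing frame is rejected rather than repaired.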

Apart from that, newer class files (version 51.0 and later) aren't allowed to use the jsr and ret instructions, which makes verification much, much, much easier.

As a specific example, suppose you have code like the following:

.method static foo : ()V
L0: aconst_null
L1: astore_0
L2: new Foo
L3: dup
L4: invokespecial Method Foo <init> ()V
L5: astore_0
L6: goto L2
.end method

Under inference verification, the verifier would initially infer the type of var 0 to be NULL at L2. Once it reaches L6, it has to go back and change the type to Foo.

Under stack map verification, the verifier will once again initially infer the type of var 0 to be NULL at L2. However, it sees that there is a stack map frame at L2 and checks what the type of var 0 is in that frame. Whatever it is, it sets var 0 to that type and continues checking. When it gets to L6, it looks at the stack map frame at the target of the jump (L2), and asserts that the type of var 0 at L6 (which is Foo) is assignable to the type of var 0 at L2 (specified in the frame at L2).

Suppose that the stack map frame at L2 declares that var 0 has type Object. Then the stack map verifier infers the following types for var 0 at each step:

L0: INVALID (unset)
L1: INVALID (unset)
L2: NULL
(checks stack frame at L2)
(assert that NULL is assignable to Object)
L2: Object
L3: Object
L4: Object
L5: Object
L6: Foo
(check stack frame at L2)
(assert that Foo is assignable to Object)

Antimony answered Oct 10 '22 22:10