
Types in Bytecode

I've been working with (Java) bytecode for some time, but it had never occurred to me to ask why some instructions are typed. I understand that an ADD operation needs to distinguish between integer addition and floating-point addition (that's why we have IADD and FADD). But why do we need to distinguish between ISTORE and FSTORE? Both perform exactly the same operation: moving 32 bits from the stack to a local variable slot.

The only answer I can think of is type safety, to prevent a sequence like (ILOAD, ILOAD, FADD). However, I believe type safety is already enforced at the Java language level. Granted, the class file format is not directly coupled to Java, so is this a way to enforce type safety for languages that do not support it? Any thoughts? Thank you.

EDIT: to follow up on Reedy's answer. I wrote this minimal program:

public static void main(String args[])
{
    int x = 1;
}

which compiled to:

iconst_1
istore_1
return

Using a bytecode editor, I changed the second instruction:

iconst_1
fstore_1
return

and it resulted in a java.lang.VerifyError: Expecting to find float on stack.

I wonder: if the stack holds no type information, just bits, how did the FSTORE instruction know that it was dealing with an int and not a float?

Note: I couldn't find a better title for this question. Feel free to improve it.

H-H asked Apr 14 '10 13:04


2 Answers

To answer your first question with my best guess: these bytecodes are different because they may require different implementations. For example, a particular architecture may keep integer operands on the main stack, but floating-point operands in hardware registers.

To answer your second question: VerifyError is thrown when the class is loaded, not when it's executed. The verification process is described in the JVM specification's section on class file verification; note pass 3.

Anon answered Sep 17 '22 12:09


These instructions are typed to ensure the program is type-safe. When loading a class, the virtual machine verifies the bytecode to ensure that, for example, a float isn't passed as an argument to a method expecting an integer. This static verification requires that the verifier can determine the number and types of the values on the stack for any given execution path. The load and store instructions need the type tag because the local variables in a stack frame are not typed (e.g. you can istore to a local variable and later fstore to the same position). The type tags on the instructions let the verifier know what type of value is stored in each local variable.

The verifier looks at each opcode in the method and keeps track of what types will be on the stack and in the local variables after executing each one. You are right that this is another form of type checking and that it duplicates some of the checks done by the Java compiler. The verification step prevents loading of any code that would cause the VM to execute an illegal instruction, and it ensures the safety properties of the Java platform without incurring the large runtime penalty of checking types before each operation. Runtime type checking for each opcode would be a performance hit each time the method is executed, but the static verification is done only once, when the class is loaded.
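As a small illustration of untyped local variable slots, here is a sketch in Java source. The class name SlotReuse and the method demo are made up for this example. It assumes that javac reuses a slot once a block-scoped variable goes out of scope, which is typical but not guaranteed; running javap -c on the compiled class shows what your compiler actually emitted.

```java
class SlotReuse {
    // Two block-scoped locals with different types and disjoint lifetimes.
    static float demo() {
        float result;
        {
            int i = 1;        // stored with an istore instruction
            result = i;       // int widened to float
        }
        {
            float f = 2.5f;   // stored with an fstore instruction; javac
                              // typically reuses the slot that held i (assumption)
            result += f;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the slot is reused, the same local variable position is written first by an istore and later by an fstore, and only the opcode tells the verifier which type it holds at each point.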

Case 1:

Instruction             Verification    Stack Types            Local Variable Types 
----------------------- --------------- ---------------------- ----------------------- 
<method entry>          OK              []                     1: none
iconst_1                OK              [int]                  1: none
istore_1                OK              []                     1: int
return                  OK              []                     1: int

Case 2:

Instruction             Verification    Stack Types            Local Variable Types 
----------------------- --------------- ---------------------- ----------------------- 
<method entry>          OK              []                     1: none
iconst_1                OK              [int]                  1: none
fstore_1                Error: Expecting to find float on stack

The error is given because the verifier knows that fstore_1 expects a float on the stack but the result of executing the previous instructions leaves an int on the stack.

This verification is done without executing the opcodes. Instead, it is done by looking at the types of the instructions, just as the Java compiler gives an error when you write (Integer)"abcd". The compiler doesn't have to run the program to know that "abcd" is a String and can't be cast to Integer.
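The two cases above can be reproduced with a toy type checker. This is only a sketch of the idea, not how the real verifier is implemented: the class TinyVerifier, its verify method, and the string-based instruction format are all invented for illustration, and it handles only straight-line code with a handful of opcodes (the real verifier also has to merge types where branches meet).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TinyVerifier {
    // The only two value types this sketch knows about.
    enum Type {
        INT("int"), FLOAT("float");
        final String label;
        Type(String label) { this.label = label; }
    }

    // Tracks types only; no instruction is ever executed.
    // Returns "OK" or an error message.
    static String verify(List<String> code) {
        Deque<Type> stack = new ArrayDeque<>();       // types on the operand stack
        Map<Integer, Type> locals = new HashMap<>();  // types in local variable slots
        for (String insn : code) {
            Type t = insn.startsWith("i") ? Type.INT : Type.FLOAT;
            switch (insn) {
                case "iconst_1":
                case "fconst_1":
                    stack.push(t);
                    break;
                case "istore_1":
                case "fstore_1":
                    if (stack.isEmpty()) {
                        return "Error: stack underflow at " + insn;
                    }
                    if (stack.pop() != t) {
                        return "Error: Expecting to find " + t.label + " on stack";
                    }
                    locals.put(1, t);  // slot 1 now holds a value of type t
                    break;
                case "iload_1":
                case "fload_1":
                    if (locals.get(1) != t) {
                        return "Error: local variable 1 does not hold " + t.label;
                    }
                    stack.push(t);
                    break;
                case "return":
                    return "OK";
                default:
                    return "Error: unknown instruction " + insn;
            }
        }
        return "Error: falling off the end of the code";
    }

    public static void main(String[] args) {
        // Case 1
        System.out.println(verify(Arrays.asList("iconst_1", "istore_1", "return")));
        // Case 2
        System.out.println(verify(Arrays.asList("iconst_1", "fstore_1", "return")));
    }
}
```

The first call reports OK and the second reports the "Expecting to find float on stack" error. The checker never needs the value 1 itself, only the fact that iconst_1 pushes an int.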

Geoff Reedy answered Sep 16 '22 12:09