
Do any compilers for the JVM use the "wide" goto?

Tags: java, jvm, goto

I figure most of you know that goto is a reserved keyword in the Java language but is not actually used. And you probably also know that goto is a Java Virtual Machine (JVM) opcode. I reckon all the sophisticated control flow structures of Java, Scala and Kotlin are, at the JVM level, implemented using some combination of goto and ifeq, ifle, iflt, etc.

Looking at the JVM spec https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.goto_w I see there's also a goto_w opcode. Whereas goto takes a 2-byte branch offset, goto_w takes a 4-byte branch offset. The spec states that

Although the goto_w instruction takes a 4-byte branch offset, other factors limit the size of a method to 65535 bytes (§4.11). This limit may be raised in a future release of the Java Virtual Machine.

It sounds to me like goto_w is future-proofing, like some of the other *_w opcodes. But it also occurs to me that maybe goto_w could be used with the two more significant bytes zeroed out and the two less significant bytes the same as for goto, with adjustments as needed.
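To make the difference between the two encodings concrete, here is a minimal sketch (not from the original post) that builds the raw bytes of both instructions. The opcode values 0xA7 (goto) and 0xC8 (goto_w) come from the JVM spec; the class and method names are made up for illustration. Note that the operand is an offset relative to the address of the branch instruction itself, not the absolute target that javap prints.

```java
import java.nio.ByteBuffer;

public class GotoEncoding {
    static final int GOTO = 0xA7;   // opcode 167
    static final int GOTO_W = 0xC8; // opcode 200

    // Short form: opcode followed by a signed 16-bit big-endian offset.
    static byte[] encodeGoto(int offset) {
        if (offset < Short.MIN_VALUE || offset > Short.MAX_VALUE) {
            throw new IllegalArgumentException("offset does not fit in 16 bits: " + offset);
        }
        return ByteBuffer.allocate(3).put((byte) GOTO).putShort((short) offset).array();
    }

    // Wide form: opcode followed by a signed 32-bit big-endian offset.
    static byte[] encodeGotoW(int offset) {
        return ByteBuffer.allocate(5).put((byte) GOTO_W).putInt(offset).array();
    }

    // Renders bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int offset = 91 - 59; // a branch located at 59 that targets 91
        System.out.println(hex(encodeGoto(offset)));  // A7 00 20
        System.out.println(hex(encodeGotoW(offset))); // C8 00 00 00 20
    }
}
```

For a small offset the wide form carries the same value with two extra zero bytes in front, which is the "zeroed-out high bytes" idea described above.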

For example, given this bytecode for a Java switch on strings (or a Scala match):

     12: lookupswitch  {
                 112785: 48   // case "red"
                3027034: 76   // case "blue"
               98619139: 62   // case "green"
                default: 87
         }
     48: aload_2
     49: ldc           #17    // String red
     51: invokevirtual #18    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
     54: ifeq          87
     57: iconst_0
     58: istore_3
     59: goto          87
     62: aload_2
     63: ldc           #19    // String green
     65: invokevirtual #18    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
     68: ifeq          87
     71: iconst_1
     72: istore_3
     73: goto          87
     76: aload_2
     77: ldc           #20    // String blue
     79: invokevirtual #18
     // etc.

we could rewrite it as

     12: lookupswitch  {
                 112785: 48
                3027034: 80
               98619139: 64
                default: 91
         }
     48: aload_2
     49: ldc           #17    // String red
     51: invokevirtual #18    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
     54: ifeq          91     // encoded as relative offset +37: 00 25
     57: iconst_0
     58: istore_3
     59: goto_w        91     // encoded as relative offset +32: 00 00 00 20
     64: aload_2
     65: ldc           #19    // String green
     67: invokevirtual #18    // Method java/lang/String.equals:(Ljava/lang/Object;)Z
     70: ifeq          91
     73: iconst_1
     74: istore_3
     75: goto_w        91
     80: aload_2
     81: ldc           #20    // String blue
     83: invokevirtual #18
     // etc.

I haven't actually tried this, since I've probably made a mistake changing the "line numbers" to accommodate the goto_ws. But since it's in the spec, it should be possible to do it.
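The offset bookkeeping can be checked mechanically. The sketch below is an illustration only: it assumes every case block has the shape shown above (aload_2, ldc, invokevirtual, ifeq, iconst_N, istore_3), that all blocks except the last end with a goto or goto_w, and that the elided third block falls through to the default target. The instruction sizes are the ones given by the JVM spec; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockOffsets {
    // Instruction sizes in bytes (opcode plus operands).
    static final int ALOAD_2 = 1, LDC = 2, INVOKEVIRTUAL = 3, IFEQ = 3, ICONST = 1, ISTORE_3 = 1;

    // Returns the start offset of each case block, followed by the default target.
    static List<Integer> blockStarts(int firstBlock, int blocks, int gotoSize) {
        List<Integer> starts = new ArrayList<>();
        int pc = firstBlock;
        for (int i = 0; i < blocks; i++) {
            starts.add(pc);
            pc += ALOAD_2 + LDC + INVOKEVIRTUAL + IFEQ + ICONST + ISTORE_3;
            if (i < blocks - 1) {
                pc += gotoSize; // assumption: the last block falls through, so it has no goto
            }
        }
        starts.add(pc); // default target
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(blockStarts(48, 3, 3)); // goto:   [48, 62, 76, 87]
        System.out.println(blockStarts(48, 3, 5)); // goto_w: [48, 64, 80, 91]
    }
}
```

Under these assumptions, swapping each 3-byte goto for a 5-byte goto_w moves the second block to 64, the third to 80, and the default target to 91.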

My question is whether there is a reason a compiler or other generator of bytecode might use goto_w with the current 65535 limit other than to show that it can be done?

Alonso del Arte asked Apr 17 '20 22:04




1 Answer

The size of the method code can be as large as 64K.

The branch offset of the short goto is a signed 16-bit integer: from -32768 to 32767.

So the short offset is not enough to jump from the beginning of a 64K method to its end.
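The range argument can be written as a one-line check. This is only a sketch of the reasoning, with a hypothetical helper name; it is not how javac itself is implemented.

```java
public class BranchWidth {
    // True if a branch located at `from` can reach `to` with a signed 16-bit offset.
    static boolean fitsShortBranch(int from, int to) {
        int offset = to - from;
        return offset >= Short.MIN_VALUE && offset <= Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(fitsShortBranch(0, 32767));  // true: largest forward short jump
        System.out.println(fitsShortBranch(0, 32768));  // false: one byte too far
        System.out.println(fitsShortBranch(40000, 0));  // false: backward jump past -32768
    }
}
```

Any branch for which this check fails needs the 4-byte offset of goto_w.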

Even javac sometimes emits goto_w. Here is an example:

    public class WideGoto {

        public static void main(String[] args) {
            for (int i = 0; i < 1_000_000_000; ) {
                i += 123456;
                // ... repeat 10K times ...
            }
        }
    }

Disassembling with javap -c:

      public static void main(java.lang.String[]);
        Code:
           0: iconst_0
           1: istore_1
           2: iload_1
           3: ldc           #2
           5: if_icmplt     13
           8: goto_w        50018     // <<< Here it is! A jump to the end of the loop
              ...
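Writing the statement out 10,000 times by hand is tedious, so a small generator can produce the source file. This is a sketch for reproducing the experiment, not part of the original answer; the generator class name, the `generate` method, and the output file name in the working directory are assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WideGotoSourceGenerator {
    // Builds the source of WideGoto with the loop body statement repeated `repeats` times.
    static String generate(int repeats) {
        StringBuilder sb = new StringBuilder();
        sb.append("public class WideGoto {\n");
        sb.append("    public static void main(String[] args) {\n");
        sb.append("        for (int i = 0; i < 1_000_000_000; ) {\n");
        for (int n = 0; n < repeats; n++) {
            sb.append("            i += 123456;\n");
        }
        sb.append("        }\n");
        sb.append("    }\n");
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Writes WideGoto.java into the current working directory.
        Files.write(Paths.get("WideGoto.java"), generate(10_000).getBytes(StandardCharsets.UTF_8));
    }
}
```

After running the generator, `javac WideGoto.java` followed by `javap -c WideGoto` should show the disassembly. If each `i += 123456;` compiles to iload_1, ldc, iadd, istore_1 (5 bytes, since the constant is too large for iinc), the 10,000 statements occupy 50,000 bytes starting at offset 13, which is consistent with the jump target 50018 shown above.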
apangin answered Oct 02 '22 13:10
