I've recently come across while developing a library that performs operations on JVM bytecode some opcodes on which there is no documentation (that I've found), yet which are recognized by the JVM reference implementation. I've found a list of these, and they are:
BREAKPOINT = 202;
LDC_QUICK = 203;
LDC_W_QUICK = 204;
LDC2_W_QUICK = 205;
GETFIELD_QUICK = 206;
PUTFIELD_QUICK = 207;
GETFIELD2_QUICK = 208;
PUTFIELD2_QUICK = 209;
GETSTATIC_QUICK = 210;
PUTSTATIC_QUICK = 211;
GETSTATIC2_QUICK = 212;
PUTSTATIC2_QUICK = 213;
INVOKEVIRTUAL_QUICK = 214;
INVOKENONVIRTUAL_QUICK = 215;
INVOKESUPER_QUICK = 216;
INVOKESTATIC_QUICK = 217;
INVOKEINTERFACE_QUICK = 218;
INVOKEVIRTUALOBJECT_QUICK = 219;
NEW_QUICK = 221;
ANEWARRAY_QUICK = 222;
MULTIANEWARRAY_QUICK = 223;
CHECKCAST_QUICK = 224;
INSTANCEOF_QUICK = 225;
INVOKEVIRTUAL_QUICK_W = 226;
GETFIELD_QUICK_W = 227;
PUTFIELD_QUICK_W = 228;
IMPDEP1 = 254;
IMPDEP2 = 255;
They seem to be replacements for their other implementations, yet have different opcodes. After a long period of trawling page after page through Google, I came across a mention of the LDC*_QUICK
opcodes in this document.
Quote from it on the LDC_QUICK
opcode:
Operation Push item from constant pool
Forms ldc_quick = 203 (0xcb)
Stack ... ..., item
Description The index is an unsigned byte that must be a valid index into the constant pool of the current class (§3.6). The constant pool item at index must have already been resolved and must be one word wide. The item is fetched from the constant pool and pushed onto the operand stack.
Notes The opcode of this instruction was originally ldc. The operand of the ldc instruction is not modified.
Alright. Seemed interesting, and so I decided to try it out. LDC_QUICK
seems to have the same format as LDC
, so I proceeded into changing a LDC
opcode to a LDC_QUICK
one. This resulted in a failure, though the JVM obviously recognized it. After attempting to run the modified file, the JVM crashed with the following output:
Exception in thread "main" java.lang.VerifyError: Bad instruction: cc
Exception Details:
Location:
Test.main([Ljava/lang/String;)V @9: fast_bgetfield
Reason:
Error exists in the bytecode
Bytecode:
0000000: bb00 0559 b700 064c 2bcc 07b6 0008 572b
0000010: b200 09b6 000a 5710 0ab8 000b 08b8 000c
0000020: 8860 aa00 0000 0032 0000 0001 0000 0003
0000030: 0000 001a 0000 0022 0000 002a b200 0d12
0000040: 0eb6 000f b200 0d12 10b6 000f b200 0d12
0000050: 11b6 000f bb00 1259 2bb6 0013 b700 14b8
0000060: 0015 a700 104d 2cb6 0016 b200 0d12 17b6
0000070: 000f b1
Exception Handler Table:
bci [84, 98] => handler: 101
Stackmap Table:
append_frame(@60,Object[#41])
same_frame(@68)
same_frame(@76)
same_frame(@84)
same_locals_1_stack_item_frame(@101,Object[#42])
same_frame(@114)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
at java.lang.Class.getMethod0(Unknown Source)
at java.lang.Class.getMethod(Unknown Source)
at sun.launcher.LauncherHelper.validateMainClass(Unknown Source)
at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source)
The above error gives mixed messages. Obviously, class file verification failed: java.lang.VerifyError: Bad instruction: cc
. At the same time, the JVM recognized the opcode: @9: fast_bgetfield
. Additionally, it seems to think that it is a different instruction, because fast_bgetfield
does not imply constant pushing...
I think its fair to say I am quite confused. What are these illegal opcodes? Do JVM's run them? Why am I receiving VerifyError
s? Deprecation? And do they have an advantage over their documented counterparts?
Any insight would be greatly appreciated.
Java bytecode is the instruction set of the Java virtual machine. Each bytecode is composed of one, or in some cases two bytes that represent the instruction (opcode), along with zero or more bytes for passing parameters. Currently, in Jan 2017, there are 205 opcodes in use out of 256 possible byte-long opcodes.
A Java Virtual Machine instruction consists of an opcode specifying the operation to be performed, followed by zero or more operands embodying values to be operated upon. This chapter gives details about the format of each Java Virtual Machine instruction and the operation it performs.
Currently, the Java Virtual Machine is constrained to a single byte of instructions. This leaves a mere 256 opcodes for implementations to work with. Of these, approximately 202 are already in use and 3 are reserved for special use cases, leaving just 55 remaining for use in the future.
The JVM has a program counter and three registers that manage the stack. It has few registers because the bytecode instructions of the JVM operate primarily on the stack. This stack-oriented design helps keep the JVM's instruction set and implementation small.
The first edition of the Java virtual machine specification described a technique used by one of Sun's early implementations of the Java virtual machine to speed up the interpretation of bytecodes. In this scheme, opcodes that refer to constant pool entries are replaced by a "_quick" opcode when the constant pool entry is resolved. When the virtual machine encounters a _quick instruction, it knows the constant pool entry is already resolved and can therefore execute the instruction faster.
The core instruction set of the Java virtual machine consists of 200 single-byte opcodes. These 200 opcodes are the only opcodes you will ever see in class files. Virtual machine implementations that use the "_quick" technique use another 25 single-byte opcodes internally, the "_quick" opcodes.
For example, when a virtual machine that uses the _quick technique resolves a constant pool entry referred to by an ldc instruction (opcode value 0x12), it replaces the ldc opcode byte in the bytecode stream with an ldc_quick instruction (opcode value 0xcb). This technique is part of the process of replacing a symbolic reference with a direct reference in Sun's early virtual machine.
For some instructions, in addition to overwriting the normal opcode with a _quick opcode, a virtual machine that uses the _quick technique overwrites the operands of the instruction with data that represents the direct reference. For example, in addition to replacing an invokevirtual opcode with an invokevirtual_quick, the virtual machine also puts the method table offset and the number of arguments into the two operand bytes that follow every invokevirtual instruction. Placing the method table offset in the bytecode stream following the invokevirtual_quick opcode saves the virtual machine the time it would take to look up the offset in the resolved constant pool entry.
Chapter 8 of Inside the Java Virtual Machine
Basically you cannot just put the opcode in the class file. Only the JVM can do that after it resolves the operands.
I don't know about all of the opcodes you have listed, but three of them—breakpoint, impdep1, and impdep2—are reserved opcodes documented in Section 6.2 of the Java Virtual Machine Specification. It says, in part:
Two of the reserved opcodes, numbers 254 (0xfe) and 255 (0xff), have the mnemonics impdep1 and impdep2, respectively. These instructions are intended to provide "back doors" or traps to implementation-specific functionality implemented in software and hardware, respectively. The third reserved opcode, number 202 (0xca), has the mnemonic breakpoint and is intended to be used by debuggers to implement breakpoints.
Although these opcodes have been reserved, they may be used only inside a Java virtual machine implementation. They cannot appear in valid class files. . . .
I suspect (from their names) that the rest of the other opcodes are part of the JIT mechanism and also cannot appear in a valid class file.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With