I was wondering whether there is an obvious and quick way, when analyzing a constructor's bytecode, to determine where the super() code ends.
More concretely, and in sharp contrast to Java, where a call in the constructor to any super() constructor method is optional (or rather, when not present -- implicit), in the bytecode world it is always needed.
For black magic purposes I need to know, by bytecode analysis alone and by the simplest method available, which INVOKESPECIAL call corresponds to the Java world's super() call.
I'll leave you here with a hard example:
public static class A {
    public A(Object o, Object b) {
    }
}
public static class B extends A {
    public B() {
        // the super(...) call below is the one I'm looking for
        super(new A(new Object(), new Integer(2)), new Integer(1));
        System.out.println(new A(new Object(), new Integer(2)));
    }
}
with the corresponding bytecode:
An explicit super(...) call is required when the subclass constructor has to invoke a parameterized superclass constructor (one that takes arguments). It must be the first statement in the body of the subclass constructor; otherwise compilation fails.
The call to super() has to come first in the derived (Student) class constructor because the superclass has no knowledge of any subclass, so whatever initialization it performs is separate from, and possibly a prerequisite to, the initialization performed by the subclass.
Whether you write this() or super() explicitly is up to you: if a constructor contains neither, the compiler inserts super() as its first statement.
this() and super() cannot both be used in the same constructor, since each would have to be the first statement. The this reference, by contrast, can be passed as an argument in method and constructor calls.
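The Java-level rules above can be illustrated with a small sketch. The class and field names (Person, Student, name, year) are hypothetical and chosen only for illustration.

```java
// Superclass with only a parameterized constructor, so subclasses
// cannot rely on an implicit super() and must call super(name).
class Person {
    final String name;

    Person(String name) {
        this.name = name;
    }
}

class Student extends Person {
    final int year;

    // Delegates to the two-argument constructor; this(...) is the first statement.
    Student(String name) {
        this(name, 1);
    }

    // Calls the parameterized superclass constructor; super(...) is the first statement.
    Student(String name, int year) {
        super(name);
        this.year = year;
    }
}
```

Each constructor starts with exactly one of this(...) or super(...), never both.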
Actually, the rules for bytecode constructors are much more lax than Java's rules.
The only rule is that exactly one constructor must be called on any path that returns normally and if a constructor call throws an exception, then you must throw an exception too.
Among other things, this means that a constructor may contain multiple calls to other constructors or none at all.
Anyway, the only guaranteed way to determine whether a given invokespecial call is initializing the current object is to do a dataflow analysis, since it's possible to initialize other objects of the same class, which would confuse a naive detector.
Edit: Here is an example of a perfectly valid class (using the Krakatau assembler syntax), showing some of the issues you could run into. Among other things, it has calls to other constructors in the same class, recursive invocation of constructors, and constructing other objects of the same class inside the constructor.
.class public ctors
.super java/lang/Object
; A normal constructor
.method public <init> : ()V
.limit locals 1
.limit stack 1
aload_0
invokespecial java/lang/Object <init> ()V
return
.end method
; A weird constructor
.method public <init> : (I)V
.limit locals 2
.limit stack 5
iload_1
ifne LREST
aload_0
invokespecial ctors <init> ()V
return
LREST:
aload_0
new ctors
iinc 1 -1
iload_1
LFAKE_START:
invokespecial ctors <init> (I)V
LFAKE_END:
iconst_0
invokespecial ctors <init> (I)V
return
.catch [0] from LFAKE_START to LFAKE_END using LCATCH
LCATCH:
aload_0
invokespecial java/lang/Object <init> ()V
return
.end method
.method public static main : ([Ljava/lang/String;)V
.limit locals 1
.limit stack 2
new ctors
iconst_5
invokespecial ctors <init> (I)V
return
.end method
A simple solution is to count the number of new A instructions and the number of A.<init> invocations. Once there are more <init> calls than new instructions, you have reached the super constructor call. You have to do the same check for new B and B.<init> in case this(...) is called.
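A minimal sketch of this counting heuristic follows. It assumes straight-line, javac-style code and uses a hypothetical, simplified Insn type rather than a real bytecode library; the instruction list in main is a hand-written approximation of what B's constructor from the question might compile to, not actual compiler output.

```java
import java.util.Arrays;
import java.util.List;

final class NewInitCounter {
    /** Hypothetical simplified instruction: opcode, plus owner class and method name where relevant. */
    static final class Insn {
        final String opcode;
        final String owner; // null for instructions that name no class
        final String name;  // method name for invokes, otherwise null

        Insn(String opcode, String owner, String name) {
            this.opcode = opcode;
            this.owner = owner;
            this.name = name;
        }
    }

    /**
     * Returns the index of the first invokespecial <init> on owner that is not
     * matched by an earlier "new owner", or -1 if there is none.
     */
    static int firstUnmatchedInit(List<Insn> code, String owner) {
        int pendingNews = 0;
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            if (!owner.equals(insn.owner)) {
                continue; // instruction refers to another class, or to none
            }
            if (insn.opcode.equals("new")) {
                pendingNews++;
            } else if (insn.opcode.equals("invokespecial") && "<init>".equals(insn.name)) {
                if (pendingNews == 0) {
                    return i; // more <init> calls than new instructions
                }
                pendingNews--;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Insn> b = Arrays.asList(
            new Insn("aload_0", null, null),
            new Insn("new", "A", null),
            new Insn("dup", null, null),
            new Insn("new", "java/lang/Object", null),
            new Insn("dup", null, null),
            new Insn("invokespecial", "java/lang/Object", "<init>"),
            new Insn("new", "java/lang/Integer", null),
            new Insn("dup", null, null),
            new Insn("iconst_2", null, null),
            new Insn("invokespecial", "java/lang/Integer", "<init>"),
            new Insn("invokespecial", "A", "<init>"),
            new Insn("new", "java/lang/Integer", null),
            new Insn("dup", null, null),
            new Insn("iconst_1", null, null),
            new Insn("invokespecial", "java/lang/Integer", "<init>"),
            new Insn("invokespecial", "A", "<init>")); // the super(...) call
        System.out.println(firstUnmatchedInit(b, "A"));
    }
}
```

For this list the method reports the last instruction (index 15). To cover this(...), run the same scan with the constructor's own class as owner.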
You have to find out at which invoke opcode the operand stack contains the this reference that will be used as the first argument. For this you just need to know the effect each opcode has on the operand stack. In your example you start with aload_0 (which pushes the this reference), then a fair amount of work happens on top of that reference (updating the operand stack all the time). Eventually you reach the invoke opcode you are looking for, which consumes the this reference (and the references for the arguments). That is the super call.
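The stack-tracking idea can be sketched as below. This is a simplified illustration, not a full verifier: it assumes straight-line code in an instance constructor, handles only a handful of opcodes, and uses a hypothetical Insn type in which "invokespecial_init" stands for an invokespecial of some <init> with a known argument count. Code with branches or exception handlers would need a real dataflow analysis.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class SuperCallFinder {
    /** What an operand stack slot holds: the constructor's own this, or anything else. */
    enum Slot { THIS, OTHER }

    /** Hypothetical simplified instruction; argCount is only used for invokespecial_init. */
    static final class Insn {
        final String opcode;
        final int argCount;

        Insn(String opcode, int argCount) {
            this.opcode = opcode;
            this.argCount = argCount;
        }
    }

    /** Returns the index of the <init> call whose receiver is this, or -1 if none is found. */
    static int findSuperCall(List<Insn> code) {
        Deque<Slot> stack = new ArrayDeque<>();
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            switch (insn.opcode) {
                case "aload_0":
                    stack.push(Slot.THIS);   // local 0 holds this in an instance constructor
                    break;
                case "new":
                case "iconst":
                    stack.push(Slot.OTHER);  // a fresh object or a constant
                    break;
                case "dup":
                    stack.push(stack.peek());
                    break;
                case "invokespecial_init":
                    for (int k = 0; k < insn.argCount; k++) {
                        stack.pop();         // discard the arguments
                    }
                    if (stack.pop() == Slot.THIS) {
                        return i;            // the receiver was this
                    }
                    break;
                default:
                    throw new IllegalArgumentException("unsupported opcode: " + insn.opcode);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hand-written approximation of super(new A(new Object(), new Integer(2)), new Integer(1))
        List<Insn> b = Arrays.asList(
            new Insn("aload_0", 0), new Insn("new", 0), new Insn("dup", 0),
            new Insn("new", 0), new Insn("dup", 0), new Insn("invokespecial_init", 0),
            new Insn("new", 0), new Insn("dup", 0), new Insn("iconst", 0),
            new Insn("invokespecial_init", 1), new Insn("invokespecial_init", 2),
            new Insn("new", 0), new Insn("dup", 0), new Insn("iconst", 0),
            new Insn("invokespecial_init", 1), new Insn("invokespecial_init", 2));
        System.out.println(findSuperCall(b));
    }
}
```

Unlike counting new and <init> pairs, this approach does not depend on class names, so it treats super(...) and this(...) the same way: whichever <init> call consumes the this reference is reported.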