My question is the following:
Its usual for Java code to have generic collections implemented like:
public class GenericCollection<T> {
private Object[] data;
public GenericCollection () {
// Backing array is a plain object array.
this.data = new Object[10];
}
@SuppressWarnings( "unchecked" )
public T get(int index) {
// And we just cast to appropriate type when needed.
return (T) this.data[index];
}
}
And used like this for example:
for (MyObject obj : genericCollection) {
obj.myObjectMethod();
}
Since the generic type of genericCollection is erased, the JVM doesn't seems to have a way to know that really inside 'data' array of genericCollection there are only MyObject instances, since the actual type of the array is Object, there could be a String in it, and calling 'myObjectMethod' on it would raise an exception.
So I'm assuming the JVM has to do some runtime checking gymnastics to know what really is inside that GenericCollection instance.
Now look at this implementation:
public class GenericCollection<T> {
private T[] data;
@SuppressWarnings( "unchecked" )
public GenericCollection ( Class<T> type ) {
// Create a type specific array.
this.data = (T[]) Array.newInstance( type, 10 );
}
public T get ( int index ) {
// No unsafe casts needed.
return this.data[index];
}
}
In this case we create a type specific array through reflection, so the JVM could infer there could be only be T objects inside that array in a given context, making the unsafe casts and possible expensive type checks redundant.
My question would be, given the things HotSpot can do, would it help in any way, performance-wise, to implement generic collections with a "proper" type specific backing array?
For example, does it helps HotSpot in removing unnecessary type checks or casts? Maybe possibly enabling it to more easily inline methods given it knows the backing array is of a specific type?
Not in this particular case.
Generic array T[]
is erased to Object[]
in the bytecode. The array getter for Object[]
always returns Object
, so it does not need check the actual type of array. Hence there is no benefit in having T[]
instead of Object[]
for array get operation. In both cases there is aaload
instruction followed by checkcast
, and it works the same way.
Meanwhile array setter will perform worse for typed array rather than Object[]
, because aastore
must check that the value matches the actual array component type.
That is, your proposed modification works equally for get
, but performs worse for set
. This can be confirmed by the following JMH benchmark.
package bench;
import org.openjdk.jmh.annotations.*;
import java.lang.reflect.Array;
@State(Scope.Benchmark)
public class Generics {
private ObjectArray<String> objectArray;
private GenericArray<String> genericArray;
private StringArray stringArray;
private int index;
@Param("100000")
private int length;
@Setup
public void setup() {
genericArray = new GenericArray<>(String.class, length);
objectArray = new ObjectArray<>(length);
stringArray = new StringArray(length);
for (int i = 0; i < length; i++) {
String s = Integer.toString(i);
objectArray.set(i, s);
genericArray.set(i, s);
stringArray.set(i, s);
}
}
@Benchmark
public String getGenericArray() {
return genericArray.get(nextIndex());
}
@Benchmark
public String getObjectArray() {
return objectArray.get(nextIndex());
}
@Benchmark
public String getStringArray() {
return stringArray.get(nextIndex());
}
@Benchmark
public void setGenericArray() {
genericArray.set(nextIndex(), "value");
}
@Benchmark
public void setObjectArray() {
objectArray.set(nextIndex(), "value");
}
@Benchmark
public void setStringArray() {
stringArray.set(nextIndex(), "value");
}
private int nextIndex() {
if (++index == length) index = 0;
return index;
}
static class GenericArray<T> {
private T[] data;
@SuppressWarnings("unchecked")
public GenericArray(Class<T> type, int length) {
this.data = (T[]) Array.newInstance(type, length);
}
public T get(int index) {
return data[index];
}
public void set(int index, T value) {
data[index] = value;
}
}
static class ObjectArray<T> {
private Object[] data;
public ObjectArray(int length) {
this.data = new Object[length];
}
@SuppressWarnings("unchecked")
public T get(int index) {
return (T) data[index];
}
public void set(int index, T value) {
data[index] = value;
}
}
static class StringArray {
private String[] data;
public StringArray(int length) {
this.data = new String[length];
}
public String get(int index) {
return data[index];
}
public void set(int index, String value) {
data[index] = value;
}
}
}
And the results:
Benchmark (length) Mode Cnt Score Error Units
Generics.getGenericArray 100000 avgt 40 5,212 ± 0,038 ns/op <- equal
Generics.getObjectArray 100000 avgt 40 5,224 ± 0,043 ns/op <-
Generics.getStringArray 100000 avgt 40 4,557 ± 0,051 ns/op
Generics.setGenericArray 100000 avgt 40 3,299 ± 0,032 ns/op <- worse
Generics.setObjectArray 100000 avgt 40 2,456 ± 0,007 ns/op <-
Generics.setStringArray 100000 avgt 40 2,138 ± 0,008 ns/op
No. The type erasure Java Tutorial explains
Generics were introduced to the Java language to provide tighter type checks at compile time and to support generic programming. To implement generics, the Java compiler applies type erasure to:
- Replace all type parameters in generic types with their bounds or Object if the type parameters are unbounded. The produced bytecode, therefore, contains only ordinary classes, interfaces, and methods.
- Insert type casts if necessary to preserve type safety.
- Generate bridge methods to preserve polymorphism in extended generic types.
Thus, after compilation, the generic types are Object
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With