
Why not remove type erasure from the next JVM?

Java introduced type erasure with generics in Java 5 so that generic code would work on old versions of Java. It was a tradeoff for compatibility. We've since lost that compatibility[1] [2] [3]: bytecode can be run on later versions of the JVM but not on earlier ones. This looks like the worst possible outcome: we've lost type information and we still can't run bytecode compiled for newer versions of the JVM on older versions. What happened?

Specifically I'm asking if there are any technical reasons why type erasure couldn't be removed in the next version of the JVM (assuming, like previous releases, its bytecode won't be able to run on the last version anyway).

[3]: Type erasure could be backported in a manner similar to retrolambda for those who really like it.

Edit: I think the discussion of the definition of backwards vs. forwards compatibility is obscuring the question.

Prime asked Jun 27 '16 13:06




1 Answer

Type erasure is more than just a byte code feature that you can turn on or off.

It affects the way the entire runtime environment works. If you want to be able to query the generic type of every instance of a generic class, it implies that meta information, comparable to a runtime Class representation, is created for each object instantiation of a generic class.

If you write new ArrayList<String>(); new ArrayList<Number>(); new ArrayList<Object>() you are not only creating three objects, you are potentially creating three additional meta objects reflecting the types, ArrayList<String>, ArrayList<Number>, and ArrayList<Object>, if they didn’t exist before.
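As a minimal sketch of the current situation (the class name ErasedClassDemo is made up for illustration), the following shows that today all three lists share a single runtime class, so no per-parameterization meta object exists:

```java
import java.util.ArrayList;
import java.util.List;

class ErasedClassDemo {
    // Returns true if all three lists report the very same Class object.
    static boolean shareOneClass() {
        List<String> strings = new ArrayList<>();
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();
        return strings.getClass() == numbers.getClass()
            && numbers.getClass() == objects.getClass();
    }

    public static void main(String[] args) {
        System.out.println(shareOneClass()); // prints "true"
        // The reported name carries no type argument at all.
        System.out.println(new ArrayList<String>().getClass().getName());
    }
}
```

Without erasure, each of these allocations would need to reference a distinct type descriptor, which is the extra cost described above.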

Consider that there are thousands of different List signatures in use in a typical application, most of them never used in a place where such Reflection would be required (since the feature does not exist today, we can conclude that all of them currently work without it).

This, of course, multiplies: a thousand different generic list types imply a thousand different generic iterator types and a thousand Spliterator and Stream incarnations, not even counting the internal classes of the implementation.

It even affects places without an object allocation that currently exploit type erasure under the hood: Collections.emptyList(), Function.identity(), Comparator.naturalOrder(), etc. return the same instance each time they are invoked. If you insist on having the particular captured generic type reflectively inspectable, this won’t work anymore. So if you write

List<String> strings = Collections.emptyList();
List<Number> numbers = Collections.emptyList();

you would have to receive two distinct instances, each of them reporting a different type on getClass() or its future equivalent.
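A small sketch of the sharing that exists today (SharedInstanceDemo is a made-up name). Note the assumption: the API documentation only says these methods need not create a new object per call; that they hand out one shared instance is how OpenJDK happens to implement them, not a guarantee of the specification.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SharedInstanceDemo {
    // Both variables have different static types but may refer to one object.
    static boolean emptyListsAreSameObject() {
        List<String> strings = Collections.emptyList();
        List<Number> numbers = Collections.emptyList();
        // Cast to Object so the compiler accepts the identity comparison.
        return (Object) strings == (Object) numbers;
    }

    static boolean naturalOrdersAreSameObject() {
        Comparator<String> byString = Comparator.naturalOrder();
        Comparator<Integer> byInteger = Comparator.naturalOrder();
        return (Object) byString == (Object) byInteger;
    }

    public static void main(String[] args) {
        System.out.println(emptyListsAreSameObject());
        System.out.println(naturalOrdersAreSameObject());
    }
}
```

On an OpenJDK runtime both calls are expected to report true, which is only possible because the instance itself knows nothing about String, Number or Integer.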


It seems that people wishing for this ability have a narrow view focused on their particular method, where it would be great to find out reflectively whether one particular parameter is actually one of two or three types. They never think about the weight of carrying meta information about potentially hundreds or thousands of generic instantiations of thousands of generic classes.

This is where we have to ask what we gain in return: the ability to support a questionable coding style (which is what altering the code’s behavior based on information found via Reflection is all about).


The answer so far has only addressed the easy aspect of removing type erasure: the desire to introspect the type of an actual instance. An actual instance has a concrete type, which could be reported. As mentioned in a comment from the user the8472, the demand for removal of type erasure often also implies the wish to be able to cast to (T), create an array via new T[], or access the type of a type variable via T.class.
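For context, here is a sketch of how these three wishes are commonly approximated today by passing an explicit Class token. This is a well-known workaround, not something proposed in this answer, and the class TypeTokenDemo and its method names are invented for illustration.

```java
import java.lang.reflect.Array;

class TypeTokenDemo<T> {
    // Explicit token standing in for the erased type variable T.
    private final Class<T> type;

    TypeTokenDemo(Class<T> type) {
        this.type = type;
    }

    // Substitute for "(T) o": Class.cast performs a real runtime check.
    T checkedCast(Object o) {
        return type.cast(o);
    }

    // Substitute for "new T[length]": the array is created reflectively.
    @SuppressWarnings("unchecked")
    T[] newArray(int length) {
        return (T[]) Array.newInstance(type, length);
    }

    // Substitute for "T.class".
    Class<T> typeOfT() {
        return type;
    }

    public static void main(String[] args) {
        TypeTokenDemo<String> demo = new TypeTokenDemo<>(String.class);
        String[] array = demo.newArray(3);
        System.out.println(array.getClass().getSimpleName());
        System.out.println(demo.checkedCast("hello"));
    }
}
```

The token only works for plain classes such as String.class. There is no List<String>.class, so this approach stops exactly where type variables resolve to parameterized or wildcarded types.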

This would raise the true nightmare. A type variable is a different beast than the actual type of a concrete instance. A type variable could resolve to, e.g., ? extends Comparator<? super Number>, to name one (rather simple) example. Providing the necessary meta information would imply that not only does object allocation become much more expensive, but every single method invocation could impose these additional costs, and to an even bigger extent, as we are now talking not only about the combination of generic classes with actual classes, but also about every possible wildcarded combination, even of nested generic types.
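To give a feeling for how much structure even this "rather simple" example carries, the sketch below inspects the signature of a declared field via the existing java.lang.reflect API. Generic signatures of declarations are already recorded in class files; the point here is only the shape of the data that would have to exist per instance or per invocation. The class and field names are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.WildcardType;
import java.util.Comparator;
import java.util.List;

class NestedSignatureDemo {
    // Hypothetical field, declared only so that its signature can be inspected.
    static List<? extends Comparator<? super Number>> comparators;

    // Walks List<...> -> "? extends ..." -> Comparator<...> -> "? super ..." -> Number.
    static Type innermostLowerBound() {
        try {
            Type declared = NestedSignatureDemo.class
                    .getDeclaredField("comparators").getGenericType();
            ParameterizedType list = (ParameterizedType) declared;
            WildcardType outer = (WildcardType) list.getActualTypeArguments()[0];
            ParameterizedType comparator = (ParameterizedType) outer.getUpperBounds()[0];
            WildcardType inner = (WildcardType) comparator.getActualTypeArguments()[0];
            return inner.getLowerBounds()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innermostLowerBound());
    }
}
```

Four nested type objects are needed to describe a single type argument, and a runtime check against such a type has to walk all of them.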

Keep in mind that the actual type of a type parameter could also refer to other type parameters, turning type checking into a very complex process. You would have to repeat it not only for every type cast; if you allow creating an array from a type variable, every store operation into that array has to repeat it as well.
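Arrays of ordinary classes already show what such a per-store check looks like, since array element types are available at runtime. A minimal sketch (ArrayStoreDemo is a made-up name):

```java
class ArrayStoreDemo {
    // Returns true if the JVM rejects storing an Integer into a String[].
    static boolean storeIsRejected() {
        Object[] objects = new String[1]; // legal, because arrays are covariant
        try {
            // The JVM checks the value against the array's element type on this store.
            objects[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected()); // prints "true"
    }
}
```

Here the check is a simple subclass test against String. With arrays of fully reified generic types, the same store would have to be checked against an arbitrarily nested generic signature.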

Besides the heavy performance cost, the complexity raises another problem. If you look at the bug tracker of javac or at related questions on Stack Overflow, you may notice that the process is not only complex but also error-prone. Currently, every minor version of javac contains changes and fixes regarding generic type signature matching, affecting what will be accepted or rejected. I’m quite sure you don’t want intrinsic JVM operations like type casts, variable assignments, or array stores to become victims of this complexity, with every version having a different idea of what is legal, or suddenly rejecting at runtime what javac accepted at compile time due to mismatching rules.

Holger answered Oct 11 '22 13:10