
Useless expectation from compiler when dealing with generics?

A compiler that must translate a generic type or method (in any language, not just Java) has in principle two choices:

Code specialization. The compiler generates a new representation for every instantiation of a generic type or method. For instance, the compiler would generate code for a list of integers and additional, different code for a list of strings, a list of dates, a list of buffers, and so on.

Code sharing. The compiler generates code for only one representation of a generic type or method and maps all the instantiations of the generic type or method to the unique representation, performing type checks and type conversions where needed.

Java uses the code sharing approach. I believe C# follows the code specialization approach, so my reasoning below would seem logical to me in C#.
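A minimal sketch of what code sharing means in practice for Java: two instantiations of the same generic type are backed by a single runtime class. The class and method names here are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SharedRepresentationDemo {

    // Returns true if both instantiations are backed by the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();
        return strings.getClass() == integers.getClass();
    }

    public static void main(String[] args) {
        // Code sharing: one class serves every instantiation.
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```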

Assuming this Java code snippet:

import java.util.Arrays;

public class Test {

    public static void main(String[] args) {
        Test t = new Test();
        String[] newArray = t.toArray(new String[4]);
    }

    @SuppressWarnings("unchecked")
    public <T> T[] toArray(T[] a) {
        //5 as static size for the sample...
        return (T[]) Arrays.copyOf(a, 5, a.getClass());
    }
}

With code sharing, this is conceptually what the code becomes after type erasure:

public class Test {

    public static void main(String[] args) {
       Test t = new Test();
       //Notice the cast added by the compiler here
       String[] newArray = (String[])t.toArray(new String[4]);
    }

    @SuppressWarnings("unchecked")
    public Object[] toArray(Object[] a) {
       //5 as static size for the sample...
       return Arrays.copyOf(a, 5, a.getClass());
    }
}
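One way to observe the erased signature without reading bytecode is reflection. This is only a sketch: ErasedSignatureDemo is a hypothetical stand-in for the Test class above, and the lookup works because the method can be found under its erased parameter type, Object[].

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class ErasedSignatureDemo {

    @SuppressWarnings("unchecked")
    public <T> T[] toArray(T[] a) {
        return (T[]) Arrays.copyOf(a, 5, a.getClass());
    }

    // Looks up toArray by its erased parameter type and returns its erased return type.
    static Class<?> erasedReturnType() throws NoSuchMethodException {
        Method m = ErasedSignatureDemo.class.getMethod("toArray", Object[].class);
        return m.getReturnType();
    }

    public static void main(String[] args) throws NoSuchMethodException {
        System.out.println(erasedReturnType()); // prints "class [Ljava.lang.Object;"
    }
}
```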

So my question is:

Why do I need to write this initial cast?

(T[]) Arrays.copyOf(a, 5, a.getClass());

instead of doing simply (before type erasure, at coding time):

Arrays.copyOf(a, 5, a.getClass());

Is this cast really necessary for the compiler?

OK, Arrays.copyOf returns Object[] here, and an Object[] cannot be assigned to a more specific type without an explicit downcast.

But can't the compiler make an effort in this case since it deals with a generic type (the return type!)?

Indeed, isn't it enough that compilers apply an explicit cast to the method's caller line?

(String[])t.toArray(new String[4]);

UPDATE

Thanks to @ruakh for his answer.

Here is a sample showing that the explicit cast is relevant even though it only has an effect at compile time:

public static void main(String[] args) {
   Test t = new Test();
   String[] newArray = t.toArray(new String[4]);
}


public <T> T[] toArray(T[] a) {
   return (T[]) Arrays.copyOf(a, 5, Object[].class);
}

Casting to T[] is the only way to give the developer a warning that the cast may not be valid. And indeed, here we end up with a downcast from Object[] to String[], which leads to a ClassCastException at runtime.
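The failure described above can be reproduced with a small sketch. The class CallSiteCastDemo and the helper methodThatThrows are hypothetical names; the helper reports the top stack frame of the exception, to show that the ClassCastException comes from the compiler-generated cast in the caller and not from inside toArray.

```java
import java.util.Arrays;

class CallSiteCastDemo {

    @SuppressWarnings("unchecked")
    static <T> T[] toArray(T[] a) {
        // Deliberately wrong: the copy is always an Object[], whatever T is.
        return (T[]) Arrays.copyOf(a, 5, Object[].class);
    }

    // Returns the name of the method whose frame raised the ClassCastException,
    // or null if no exception was thrown.
    static String methodThatThrows() {
        try {
            // The compiler-inserted (String[]) cast is executed on this line.
            String[] newArray = toArray(new String[4]);
            return null;
        } catch (ClassCastException e) {
            return e.getStackTrace()[0].getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println(methodThatThrows());
    }
}
```

The unchecked cast inside toArray passes silently; the only hint that something is wrong is the warning that @SuppressWarnings hides.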

So, to the question "isn't it enough that compilers apply an explicit cast to the method's caller line", the answer is:

The developer has no control over that cast, since the compiler generates it automatically. Because it exists only at run time, it cannot warn the developer to check the code's safety before compiling and running it.

In a nutshell, the explicit cast is worth having.

Mik378 asked Nov 30 '12 00:11




2 Answers

Suppose (conceptually) the type of "a" is List<String>[].

We have no way to obtain the full type of a at run time, so we resort to a.getClass(), which returns List[]. The copy is also a List[], and you want to cast it to List<String>[]. The compiler cannot prove that the cast is safe. You can, since all elements in the copy are of type List<String>. You know better than the compiler, so you need an explicit cast, and you are justified in suppressing the "unchecked" warning.
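A small sketch of that situation, with hypothetical names (ErasedArrayTypeDemo, copyOfListArray). It shows that the runtime class of a List<String>[] is plain List[], so the copy has to be cast back with an unchecked cast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ErasedArrayTypeDemo {

    @SuppressWarnings("unchecked")
    static List<String>[] copyOfListArray() {
        // Generic array creation is not allowed, so even building the input
        // requires an unchecked cast from List[].
        List<String>[] a = (List<String>[]) new List[2];
        a[0] = new ArrayList<>(Arrays.asList("x", "y"));
        // a.getClass() is just List[]; the <String> part is not available at run time.
        return (List<String>[]) Arrays.copyOf(a, 5, a.getClass());
    }

    public static void main(String[] args) {
        List<String>[] copy = copyOfListArray();
        System.out.println(copy.getClass() == List[].class); // prints "true"
        System.out.println(copy[0]);                         // prints "[x, y]"
    }
}
```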

Even though your code, like many unchecked casts, works flawlessly on today's Java platforms, it is theoretically wrong. The compiler is not frivolous in issuing the warning. But we have no choice.

The root conflict is the attitude towards erasure.

The compiler lives in an ideal world, as if all types were full types. The compiler does not acknowledge erased types. There is this faint hope that one day Java will implement the ideal type system by dropping erasure, so the compiler today does not work on the assumption of erasure.

We the programmers live in the erased world. We have to work with erased types and pretend that they are full types, since we don't have access to the real full types.

Our code only works in the erased world; if Java gets rid of erasure one day, all of it will break. Casting List[] to List<String>[]? Nonsense! Disallowed!

But we have no choice today. Code that depends on erasure is ubiquitous. That's a huge problem if Java wants to get rid of erasure. Java probably will never do that. It is cursed.

irreputable answered Sep 29 '22 02:09


There are two problems with your line of reasoning.

One problem is that explicit casts are both a compile-time feature (part of the static type system) and a run-time feature (part of the dynamic type system). At compile-time, they convert an expression of one static type to an expression of another static type. At run-time, they ensure type-safety, by enforcing the requirement that the dynamic type is actually a subtype of that static type. In your example, of course, the run-time feature is skipped, because erasure means that there's not enough information to enforce the cast at run-time. But the compile-time feature is still relevant.
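The two roles of a cast can be contrasted in a short sketch. The names are hypothetical, and the behavior of the unchecked case relies on the compiler not inserting a cast when the result is assigned to Object, which is an assumption about javac's usual behavior that I am not quoting from the specification.

```java
class CastKindsDemo {

    // Checked cast: Integer is a reifiable type, so the JVM verifies the
    // dynamic type when the cast executes.
    static boolean checkedCastThrows(Object o) {
        try {
            Integer i = (Integer) o;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Unchecked cast: T is erased to Object, so nothing is verified here.
    @SuppressWarnings("unchecked")
    static <T> T uncheckedCast(Object o) {
        return (T) o;
    }

    static boolean uncheckedCastThrows(Object o) {
        try {
            // The result is only used as Object, so no Integer check is needed.
            Object result = CastKindsDemo.<Integer>uncheckedCast(o);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(checkedCastThrows("47"));   // run-time check fails
        System.out.println(uncheckedCastThrows("47")); // no run-time check at the cast
    }
}
```

In both methods the cast changes the static type of the expression; only the checked one also has a run-time effect.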

Consider this method:

private void printInt(Number n)
{
    Integer i = (Integer) n;
    System.out.println(i + 10);
}

Do you think that the following should be valid:

Object o = 47;
printInt(o);            // note: no cast to Number

on the grounds that printInt will immediately cast its argument to Integer anyway, so there's no need to require that callers cast it to Number?

The other problem with your line of reasoning is that, although erasure and unchecked casts do sacrifice a bit of type-safety, the compiler compensates for this sacrifice by issuing warnings. If you write a Java program that does not give any unchecked (or raw-type) warnings, then you can be sure that it won't throw any ClassCastExceptions due to implicit, run-time-only, compiler-generated downcasts. (I mean, unless you're suppressing such warnings, of course.) In your example, you have a method that claims to be generic, and that claims that its return-type is the same as its parameter-type. By providing an explicit cast to T[], you're giving the compiler the opportunity to issue a warning and tell you that it cannot enforce that claim at that location. Without such a cast, there would be no place to warn about the potential resultant ClassCastException in the calling method.
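For comparison, here is a sketch of a variant that compiles without any unchecked warning, assuming the two-argument overload Arrays.copyOf(T[], int) fits the use case. The class name is hypothetical. That overload is declared to return T[], so no cast is written in user code; the unchecked cast has not disappeared, it has moved into the JDK method.

```java
import java.util.Arrays;

class WarningFreeToArrayDemo {

    // No cast and no @SuppressWarnings: this overload is declared to return T[].
    static <T> T[] toArray(T[] a) {
        return Arrays.copyOf(a, 5);
    }

    public static void main(String[] args) {
        String[] newArray = toArray(new String[] {"a", "b"});
        System.out.println(newArray.length);                     // prints "5"
        System.out.println(newArray.getClass().getSimpleName()); // prints "String[]"
    }
}
```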

ruakh answered Sep 29 '22 04:09