 

Are Java generics really this clumsy? Why?

Tags: java, generics

Bear with me for a while. I know this sounds subjective and argumentative at first, but I swear there is a question mark at the end, and that the question can actually be answered in an objective way...

Coming from a .NET and C# background, I have during recent years been spoiled with the syntactic sugar that generics combined with extension methods provide in many .NET solutions to common problems. One of the key features that make C# generics so extremely powerful is the fact that if there is enough information elsewhere, the compiler can infer the type arguments, so I almost never have to write them out. You don't have to write many lines of code before you realize how many keystrokes you save on that. For example, I can write

var someStrings = new List<string>();
// fill the list with a couple of strings...
var asArray = someStrings.ToArray();

and C# will just know that I mean the first var to be List<string>, the second one to be string[] and that .ToArray() really is .ToArray<string>().

Then I come to Java.

I have understood enough about Java generics to know that they are fundamentally different, above all in the fact that the compiler doesn't actually compile to generic code: it strips the type arguments and makes it work anyway, in some (quite complicated) way that I haven't really understood yet. But even though I know generics in Java are fundamentally different, I can't understand why constructs like these are necessary:

 ArrayList<String> someStrings = new ArrayList<String>();
 // fill the list with a couple of strings...
 String[] asArray = someStrings.toArray(new String[0]); // <-- HERE!

Why on earth must I instantiate a new String[], with no elements in it, that won't be used for anything, for the Java compiler to know that it is String[] and not any other type of array I want?

I realize that this is the way the overload looks, and that toArray() currently returns an Object[] instead. But why was this decision made when this part of Java was invented? Why is this design better than, say, skipping the toArray() overload that returns Object[] entirely and just having a toArray() that returns T[]? Is this a limitation in the compiler, or in the imagination of the designers of this part of the framework, or something else?

As you can probably tell from my extremely keen interest in things of the utmost unimportance, I haven't slept in a while...

Tomas Aschan asked Jan 05 '12 06:01




1 Answer

No, most of these reasons are wrong. It has nothing to do with "backward compatibility" or anything like that. It's not because there's a method with a return type of Object[] (many signatures were changed for generics where appropriate). Nor is it because taking an array will save it from reallocating an array. They didn't "leave it out by mistake" or make a bad design decision. They didn't include a T[] toArray() because it can't be written, given the way arrays work and the way type erasure works in generics.

It is entirely legal to declare a method of List<T> to have the signature T[] toArray(). However, there is no way to correctly implement such a method. (Why don't you give it a try as an exercise?)

Keep in mind that:

  • Arrays know at runtime the component type they were created with. Insertions into the array are checked at runtime. And casts from more general array types to more specific array types are checked at runtime. To create an array, you must know the component type at runtime (either using new Foo[x] or using Array.newInstance()).
  • Objects of generic (parameterized) types don't know the type arguments they were created with. Type parameters are replaced by their erasure (the upper bound, or Object when there is no bound), and only those erased types are checked at runtime.

Therefore you can't create an array of a type parameter component type, i.e. new T[...].
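To make the two points above concrete, here is a small sketch. The class and method names are made up for illustration; only the standard library calls are real. It shows that an array carries its component type at runtime and checks stores against it, while two differently parameterized lists share a single runtime class.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative names only; nothing here comes from the original post.
class ReifiedVsErased {
    // Arrays remember the component type they were created with.
    static Class<?> componentTypeOf(Object[] array) {
        return array.getClass().getComponentType();
    }

    // Stores into an array are checked against that component type at runtime.
    static boolean storeIsRejected() {
        Object[] objects = new String[1]; // static type Object[], runtime class String[]
        try {
            objects[0] = Integer.valueOf(42); // compiles, but fails at runtime
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Parameterized types share one runtime class; the type argument is gone.
    static boolean listsShareRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> integers = new ArrayList<>();
        return strings.getClass() == integers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(componentTypeOf(new String[0]));
        System.out.println(storeIsRejected());
        System.out.println(listsShareRuntimeClass());
    }
}
```

Since a List<String> has no runtime record of String, there is nothing it could hand to new or Array.newInstance() to build a String[].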

In fact, if Lists had a method T[] toArray(), then generic array creation (new T[n]), which is not possible currently, would be possible:

List<T> temp = new ArrayList<T>();
for (int i = 0; i < n; i++)
    temp.add(null);
T[] result = temp.toArray();
// equivalent to: T[] result = new T[n];

Generics are just compile-time syntactic sugar. They can be added or removed by changing a few declarations and adding casts, without affecting the actual implementation logic of the code. Let's compare the 1.4 API and 1.5 API:

1.4 API:

Object[] toArray();
Object[] toArray(Object[] a);

Here, we just have a List object. The first method has a declared return type of Object[], and it creates an object of runtime class Object[]. (Remember that compile-time (static) types of variables and runtime (dynamic) types of objects are different things.)

In the second method, suppose we create a String[] object (i.e. new String[0]) and pass that to it. Arrays have a subtyping relationship based on the subtyping of their component types, so String[] is a subtype of Object[], so this is fine. What is most important to note here is that it returns an object of runtime class String[], even though its declared return type is Object[]. (Again, String[] is a subtype of Object[], so this is not unusual.)

However, if you try to cast the result of the first method to type String[], you will get a class cast exception, because as noted before, its actual runtime type is Object[]. If you cast the result of the second method (assuming you passed in a String[]) to String[], it will succeed.

So even though you may not notice it (both methods seem to return Object[]), there is already a big fundamental difference in the actual returned object in pre-Generics between these two methods.
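The difference can be observed directly. The following is a minimal sketch using ArrayList (the class and helper names are mine, chosen for illustration): it compares the runtime class of each method's result and tries the cast described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative names only.
class ToArrayRuntimeTypes {
    static List<String> sample() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        return list;
    }

    // Runtime class of the array produced by the no-argument method.
    static Class<?> classOfNoArgResult() {
        return sample().toArray().getClass();
    }

    // Runtime class of the array produced when a String[] is passed in.
    static Class<?> classOfArrayArgResult() {
        return sample().toArray(new String[0]).getClass();
    }

    // Casting the no-argument result to String[] fails at runtime.
    static boolean castOfNoArgResultFails() {
        Object[] plain = sample().toArray();
        try {
            String[] strings = (String[]) plain;
            return strings == null; // not reached for an ArrayList
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(classOfNoArgResult().getSimpleName());
        System.out.println(classOfArrayArgResult().getSimpleName());
        System.out.println(castOfNoArgResultFails());
    }
}
```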

1.5 API:

Object[] toArray();
T[] toArray(T[] a);

The exact same thing happens here. Generics add some nice stuff, like checking the argument type of the second method at compile time. But the fundamentals are still the same: the first method creates an object whose real runtime type is Object[], and the second method creates an object whose real runtime type is the same as that of the array you passed in.

In fact, if you pass in an array whose class is actually a subtype of T[], say U[], even though we have a List<T>, guess what it does? It tries to put all the elements into a U[] array, which might succeed (if all the elements happen to be of type U) or fail (if not), and returns an object whose actual type is U[].
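As a rough illustration of that case, take a List<Object> and pass it a String[] (so T is Object and U is String). The class and method names below are invented for the example. In the real API the method is declared with its own type parameter, which is why this compiles.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative names only.
class ForeignArrayType {
    // Every element happens to be a String, so filling a String[] succeeds.
    static Class<?> resultClassWhenAllStrings() {
        List<Object> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        Object[] result = list.toArray(new String[0]);
        return result.getClass();
    }

    // One element is not a String, so filling a String[] fails.
    static boolean mixedElementsFail() {
        List<Object> list = new ArrayList<>();
        list.add("a");
        list.add(Integer.valueOf(1));
        try {
            list.toArray(new String[0]);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(resultClassWhenAllStrings().getSimpleName());
        System.out.println(mixedElementsFail());
    }
}
```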

So back to my point earlier. Why can't you make a method T[] toArray()? Because you don't know the type of array you want to create (whether using new or Array.newInstance()).

T[] toArray() {
    // what would you put here?
}

Why can't you just create a new Object[n] and then cast it to T[]? It wouldn't crash immediately, since T is erased inside this method. But once you return it to outside code that expects a specific array type, e.g. String[] strings = myStringList.toArray();, it would throw a ClassCastException, because the compiler inserts an implicit cast there for generics.
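Here is a sketch of that Object[]-and-cast attempt. NaiveList is a hypothetical wrapper written only for this illustration, not a real library class. The cast inside toArray() passes silently; the failure shows up at the call site.

```java
import java.util.ArrayList;
import java.util.List;

class NaiveToArrayDemo {
    // Returns true when assigning the result to String[] blows up.
    static boolean callSiteCastFails() {
        NaiveList<String> list = new NaiveList<>();
        list.add("a");
        try {
            String[] strings = list.toArray(); // compiler inserts a cast to String[] here
            return strings == null; // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(callSiteCastFails());
    }
}

// Hypothetical list wrapper, invented for this example.
class NaiveList<T> {
    private final List<T> items = new ArrayList<>();

    void add(T item) {
        items.add(item);
    }

    @SuppressWarnings("unchecked")
    T[] toArray() {
        // T is erased, so this cast is unchecked and cannot fail inside the method.
        T[] result = (T[]) new Object[items.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = items.get(i);
        }
        return result;
    }
}
```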

People can try all sorts of hacks, like looking at the first element of the list to try to determine the component type, but that doesn't work, because (1) elements can be null, and (2) elements can be of a subtype of the actual component type, so creating an array of that type might fail later on when you try to put other elements in. Basically, there is no good way around this.
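For the curious, a sketch of the first-element hack and both ways it breaks. guessArray is a made-up helper, not something from any library; Array.newInstance() is the real reflection call.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.List;

// Illustrative names only.
class FirstElementHack {
    // Guess the component type from the first element (the hack under discussion).
    @SuppressWarnings("unchecked")
    static <T> T[] guessArray(List<T> list) {
        Class<?> guessed = list.get(0).getClass(); // NullPointerException if the element is null
        T[] result = (T[]) Array.newInstance(guessed, list.size());
        for (int i = 0; i < list.size(); i++) {
            result[i] = list.get(i); // ArrayStoreException if an element doesn't fit the guess
        }
        return result;
    }

    // Problem (1): the first element is null.
    static boolean nullFirstElementFails() {
        try {
            guessArray(Arrays.<String>asList(null, "x"));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Problem (2): the first element is an Integer, so a Double no longer fits.
    static boolean mixedSubtypesFail() {
        try {
            guessArray(Arrays.<Number>asList(1, 2.5));
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Even when it "works", a List<Number> yields an Integer[], not a Number[].
    static Class<?> guessedClassForIntegers() {
        Number[] result = guessArray(Arrays.<Number>asList(1, 2));
        return result.getClass();
    }

    public static void main(String[] args) {
        System.out.println(nullFirstElementFails());
        System.out.println(mixedSubtypesFail());
        System.out.println(guessedClassForIntegers().getSimpleName());
    }
}
```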

newacct answered Sep 18 '22 19:09