
Are Java generics really this clumsy? Why?

Tags:

java

generics

Bear with me for a while. I know this sounds subjective and argumentative at first, but I swear there is a question mark at the end, and that the question can actually be answered in an objective way...

Coming from a .NET and C# background, I have during recent years been spoiled with the syntactic sugar that generics combined with extension methods provide in many .NET solutions to common problems. One of the key features that make C# generics so extremely powerful is the fact that if there is enough information elsewhere, the compiler can infer the type arguments, so I almost never have to write them out. You don't have to write many lines of code before you realize how many keystrokes you save on that. For example, I can write

var someStrings = new List<string>();
// fill the list with a couple of strings...
var asArray = someStrings.ToArray();

and C# will just know that I mean the first var to be List<string>, the second one to be string[] and that .ToArray() really is .ToArray<string>().

Then I come to Java.

I have understood enough about Java generics to know that they are fundamentally different, above all in the fact that the compiler doesn't actually compile to generic code - it strips the type arguments and makes it work anyway, in some (quite complicated) way (that I haven't really understood yet). But even though I know generics in Java are fundamentally different, I can't understand why constructs like these are necessary:

 ArrayList<String> someStrings = new ArrayList<String>();
 // fill the list with a couple of strings...
 String[] asArray = someStrings.toArray(new String[0]); // <-- HERE!

Why on earth must I instantiate a new String[], with no elements in it, that won't be used for anything, for the Java compiler to know that it is String[] and not any other type of array I want?

I realize that this is the way the overload looks, and that toArray() currently returns an Object[] instead. But why was this decision made when this part of Java was invented? Why is this design better than, say, skipping the toArray() overload that returns Object[] entirely and just having a toArray() that returns T[]? Is this a limitation in the compiler, or in the imagination of the designers of this part of the framework, or something else?

As you can probably tell from my extremely keen interest in things of the utmost unimportance, I haven't slept in a while...

705

asked Jan 05 '12 06:01

Tomas Aschan

1 Answer

No, most of these reasons are wrong. It has nothing to do with "backward compatibility" or anything like that. It's not because there's a method with a return type of Object[] (many signatures were changed for generics where appropriate). Nor is it because taking an array will save it from reallocating an array. They didn't "leave it out by mistake" or make a bad design decision. They didn't include a T[] toArray() because it can't be written, given the way arrays work and the way type erasure works in generics.

It is entirely legal to declare a method of List<T> to have the signature T[] toArray(). However, there is no way to correctly implement such a method. (Why don't you give it a try as an exercise?)

Keep in mind that:

Arrays know at runtime the component type they were created with. Insertions into the array are checked at runtime. And casts from more general array types to more specific array types are checked at runtime. To create an array, you must know the component type at runtime (either using new Foo[x] or using Array.newInstance()).
Objects of generic (parameterized) types don't know the type arguments they were created with. Each type parameter is replaced by its erasure (its leftmost declared bound, or Object if there is none), and only that is checked at runtime.

Therefore you can't create an array of a type parameter component type, i.e. new T[...].
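To make the two points above concrete, here is a minimal sketch. The class and method names (ReifiedVsErased and friends) are made up for illustration; only the JDK types are real. It shows an array rejecting a wrongly typed element at runtime, while a generic list, having lost its type argument to erasure, accepts one without complaint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class ReifiedVsErased {

    // An array remembers its component type, so a wrong store is caught at runtime.
    static boolean arrayRejectsWrongElement() {
        Object[] objects = new String[1];      // legal: String[] is a subtype of Object[]
        try {
            objects[0] = Integer.valueOf(42);  // checked against String when it executes
            return false;
        } catch (ArrayStoreException expected) {
            return true;
        }
    }

    // A generic list keeps no record of its type argument, so the same mistake slips through.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean listAcceptsWrongElement() {
        List<String> strings = new ArrayList<>();
        List raw = strings;                    // raw view of the same list object
        raw.add(Integer.valueOf(42));          // no runtime check happens here
        return strings.size() == 1;
    }

    // After erasure, both lists are instances of the very same class.
    static boolean sameRuntimeClass() {
        return new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass();
    }

    public static void main(String[] args) {
        System.out.println("array rejects wrong element: " + arrayRejectsWrongElement());
        System.out.println("list accepts wrong element:  " + listAcceptsWrongElement());
        System.out.println("same runtime class:          " + sameRuntimeClass());
    }
}
```

The raw-type trick is only there to get past the compiler; the point is that nothing at runtime stops the Integer from going into the List<String>.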

In fact, if Lists had a method T[] toArray(), then generic array creation (new T[n]), which is not possible currently, would be possible:

List<T> temp = new ArrayList<T>();
for (int i = 0; i < n; i++)
    temp.add(null);
T[] result = temp.toArray();
// equivalent to: T[] result = new T[n];

Generics are just compile-time syntactic sugar. They can be added or removed by changing a few declarations and adding casts, without affecting the actual implementation logic of the code. Let's compare the 1.4 API and the 1.5 API:

1.4 API:

Object[] toArray();
Object[] toArray(Object[] a);

Here, we just have a List object. The first method has a declared return type of Object[], and it creates an object of runtime class Object[]. (Remember that compile-time (static) types of variables and runtime (dynamic) types of objects are different things.)

In the second method, suppose we create a String[] object (i.e. new String[0]) and pass that to it. Arrays have a subtyping relationship based on the subtyping of their component types, so String[] is a subtype of Object[], so this is fine. What is most important to note here is that it returns an object of runtime class String[], even though its declared return type is Object[]. (Again, String[] is a subtype of Object[], so this is not unusual.)

However, if you try to cast the result of the first method to type String[], you will get a ClassCastException, because as noted before, its actual runtime type is Object[]. If you cast the result of the second method (assuming you passed in a String[]) to String[], it will succeed.

So even though you may not notice it (both methods seem to return Object[]), there was already a fundamental difference between the objects these two methods actually return, even before generics.
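The difference is easy to observe. The following is a small sketch (the class and method names are mine, not from any library) that inspects the runtime class of what each overload returns for an ArrayList of strings and then attempts the cast described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class ToArrayRuntimeTypes {

    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("a", "b"));
    }

    // Runtime class of the array produced by the no-argument overload.
    static Class<?> classOfNoArgResult() {
        return sample().toArray().getClass();
    }

    // Runtime class of the array produced when a String[] is passed in.
    static Class<?> classOfTypedResult() {
        Object[] result = sample().toArray(new String[0]);
        return result.getClass();
    }

    // Casting the no-argument result to String[] fails, even though every element is a String.
    static boolean castOfNoArgResultFails() {
        Object[] plain = sample().toArray();
        try {
            String[] strings = (String[]) plain;
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(classOfNoArgResult() == Object[].class);
        System.out.println(classOfTypedResult() == String[].class);
        System.out.println(castOfNoArgResultFails());
    }
}
```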

1.5 API:

Object[] toArray();
T[] toArray(T[] a);

The exact same thing happens here. Generics add some compile-time conveniences, such as typing the second method's return value after the array you pass in, so you no longer need a cast. But the fundamentals are still the same: the first method creates an object whose real runtime type is Object[], and the second method creates an object whose real runtime type is the same as that of the array you passed in.

In fact, if you pass in an array whose class is actually a subtype of T[], say U[], even though we have a List<T>, guess what it does? It tries to put all the elements into a U[] array, which succeeds if all the elements happen to be of type U and fails with an ArrayStoreException if not, and it returns an object whose actual type is U[].
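Here is a short sketch of that case, with Number playing the role of T and Integer the role of U. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class SubtypeArrayDemo {

    // Every element is an Integer, so copying into an Integer[] works.
    static Class<?> allIntegers() {
        List<Number> numbers = new ArrayList<>();
        numbers.add(1);
        numbers.add(2);
        Number[] result = numbers.toArray(new Integer[0]);
        return result.getClass();              // the runtime class follows the argument
    }

    // One element is a Double, so storing it into an Integer[] is rejected at runtime.
    static boolean mixedElementsFail() {
        List<Number> numbers = new ArrayList<>();
        numbers.add(1);
        numbers.add(2.5);
        try {
            numbers.toArray(new Integer[0]);
            return false;
        } catch (ArrayStoreException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(allIntegers() == Integer[].class);
        System.out.println(mixedElementsFail());
    }
}
```

Both calls compile without complaint; whether they work is decided purely by the runtime type of the array and of the elements.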

So back to my point earlier. Why can't you make a method T[] toArray()? Because you don't know the type of array you want to create (either using new or Array.newInstance()).

T[] toArray() {
    // what would you put here?
}
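For contrast, the array-taking version can be written, precisely because the argument carries its component type at runtime. What follows is a rough sketch under my own assumptions, not the JDK's actual code: ArrayTokenSketch and copyInto are hypothetical names, and unlike the real toArray(T[]) it always allocates a new array instead of reusing the one passed in when it is large enough.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; a simplified stand-in for what an array-taking toArray can do.
final class ArrayTokenSketch {

    @SuppressWarnings("unchecked")
    static <T> T[] copyInto(List<? extends T> source, T[] prototype) {
        // The prototype array knows its component type at runtime; T alone does not.
        Class<?> componentType = prototype.getClass().getComponentType();
        T[] result = (T[]) Array.newInstance(componentType, source.size());
        for (int i = 0; i < source.size(); i++) {
            result[i] = source.get(i);         // each store is checked against componentType
        }
        return result;
    }

    // Runtime class of the array built from a String[] prototype.
    static Class<?> demoRuntimeClass() {
        String[] copy = copyInto(Arrays.asList("x", "y"), new String[0]);
        return copy.getClass();
    }

    public static void main(String[] args) {
        System.out.println(demoRuntimeClass() == String[].class);
    }
}
```

Take the prototype parameter away and there is nothing left to hand to Array.newInstance, which is exactly the problem with a no-argument T[] toArray().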

Why can't you just create a new Object[n] and then cast it to T[]? It wouldn't crash immediately (since T is erased inside this method), but it would as soon as you return it to outside code that requested a specific array type, e.g. String[] strings = myStringList.toArray();. That line would throw a ClassCastException, because the compiler inserts an implicit cast to String[] there.
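You can try this yourself with a small wrapper. In the sketch below, Box and toArrayUnsafe are hypothetical names standing in for a list with a no-argument generic toArray; the rest is plain JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class UncheckedArrayCast {

    // Stand-in for a list type that tries to offer T[] toArray() via a cast.
    static final class Box<T> {
        private final List<T> items = new ArrayList<>();

        void add(T item) {
            items.add(item);
        }

        @SuppressWarnings("unchecked")
        T[] toArrayUnsafe() {
            // Compiles with an unchecked warning; the object is still an Object[].
            return (T[]) items.toArray();
        }

        // Inside the generic class T is erased, so using the result as T[] is harmless.
        int countInside() {
            T[] array = toArrayUnsafe();
            return array.length;
        }
    }

    // At a call site that wants String[], the compiler-inserted cast blows up.
    static boolean callSiteFails() {
        Box<String> box = new Box<>();
        box.add("a");
        try {
            String[] strings = box.toArrayUnsafe();
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Box<String> box = new Box<>();
        box.add("a");
        System.out.println(box.countInside());
        System.out.println(callSiteFails());
    }
}
```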

People can try all sorts of hacks, like looking at the first element of the list to determine the component type, but that doesn't work, because (1) elements can be null, and (2) elements can be of a subtype of the actual component type, and creating an array of that type might fail later on when you try to put other elements in, etc. Basically, there is no good way around this.
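Here is what the first-element hack looks like when written out, as a sketch with invented names (FirstElementHack, guessArray), together with the two ways it breaks.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class FirstElementHack {

    // Guess the component type from the first element's class (this is the hack).
    @SuppressWarnings("unchecked")
    static <T> T[] guessArray(List<T> list) {
        Class<?> guessed = list.get(0).getClass();   // NullPointerException if the element is null
        T[] result = (T[]) Array.newInstance(guessed, list.size());
        for (int i = 0; i < list.size(); i++) {
            result[i] = list.get(i);                 // ArrayStoreException if the guess was too narrow
        }
        return result;
    }

    // Problem (1): a null first element gives nothing to inspect.
    static boolean nullFirstElementFails() {
        List<String> strings = new ArrayList<>();
        strings.add(null);
        try {
            guessArray(strings);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Problem (2): the first element is an Integer, but the list also holds a Double.
    static boolean subtypeGuessFails() {
        List<Number> numbers = new ArrayList<>();
        numbers.add(1);
        numbers.add(2.5);
        try {
            guessArray(numbers);
            return false;
        } catch (ArrayStoreException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nullFirstElementFails());
        System.out.println(subtypeGuessFails());
    }
}
```

Even when the copy happens to succeed, say for a List<Number> that currently holds only Integers, the caller gets an Integer[] and will hit an ArrayStoreException the first time a Double is stored into it.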

101

answered Sep 18 '22 19:09

newacct
