
Why does Java Collector.toList() require a wildcard type placeholder in its return type?

I've been doing some Java Streams manipulation, and of course it doesn't like my code and is refusing to provide useful error messages. (For reference, I have no problems whatsoever with C# and Linq, so I understand conceptually everything I'm trying to do.) So I started digging into adding the explicit generic types to every method in my code so I can find the source of the problem, as past experience tells me that this is a successful path forward.

While looking around I ran across something I don't understand. Consider the following code from the Java source (reformatted a little bit):

public static <T> Collector<T, ?, List<T>> toList() {
    return new Collectors.CollectorImpl<>(
        (Supplier<List<T>>) ArrayList::new,
        List::add,
        (left, right) -> {
             left.addAll(right);
             return left;
        },
        Collectors.CH_ID
    );
}

Why does the toList method signature require a wildcard ? in its return type? When I remove it, I get

Wrong number of type arguments: 2; required: 3

and

Incompatible types.
Required: Collector<T, List<T>, >
Found: CollectorImpl<java.lang.Object, List<T>, java.lang.Object>

When I change ? to Object, I get (referring to these lines/methods in the above code):

List::add – Cannot resolve method 'add'
left.addAll – Cannot resolve method 'addAll(java.lang.Object)'

When I put the wildcard back and examine these two, they are:

List – public abstract boolean add(T e)
List – public abstract boolean addAll(Collection<? extends T> c)

Further fiddling around hasn't taught me anything more.

I understand that in a scenario such as ? extends T, the Java wildcard can be translated to C# as a new generic type parameter with the constraint where TWildCard : T. But what is going on with toList above, where the return type has a bare wildcard?

asked May 26 '18 at 00:05 by ErikE



1 Answer

Collectors have three type parameters:

T - the type of input elements to the reduction operation

A - the mutable accumulation type of the reduction operation (often hidden as an implementation detail)

R - the result type of the reduction operation

For some collectors, such as toList, the types of A and R are the same, because the result itself is used for accumulation.

The actual type of the collector returned from toList would be Collector<T, List<T>, List<T>>.

(An example of a collector which accumulates with a type which is different from its result is Collectors.joining() which uses a StringBuilder.)
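As a small sketch of how this looks from the caller's side, the two collectors mentioned above can be declared with only T and R spelled out, leaving the accumulation type as a wildcard. The class and method names here (WildcardAccumulatorDemo, toListExample, joiningExample) are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WildcardAccumulatorDemo {
    // Only T (String) and R (List<String>) are named; A stays a wildcard.
    public static List<String> toListExample() {
        Collector<String, ?, List<String>> listCollector = Collectors.toList();
        return Stream.of("a", "b", "c").collect(listCollector);
    }

    // joining() accepts CharSequence elements and produces a String.
    // Whatever it accumulates into is not part of the declared type.
    public static String joiningExample() {
        Collector<CharSequence, ?, String> joinCollector = Collectors.joining();
        return Stream.of("a", "b", "c").collect(joinCollector);
    }

    public static void main(String[] args) {
        System.out.println(toListExample());  // [a, b, c]
        System.out.println(joiningExample()); // abc
    }
}
```

Neither call site ever needs to know or name the accumulation type, which is the point of the wildcard.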

The type argument to A is a wildcard most of the time because we don't generally care what it actually is. Its actual type is only used internally by the collector, and we can capture it if we need to refer to it by a name:

// Example of using a collector.
// (No reason to actually write this code, of course.)
public static <T, R> R collect(Stream<T> stream,
                               Collector<T, ?, R> c) {
    return captureAndCollect(stream, c);
}
private static <T, A, R> captureAndCollect(Stream<T> stream,
                                           Collector<T, A, R> c) {
    // Create a new A, whatever that is.
    A a = c.supplier().get();

    // Pass the A to the accumulator along with each element.
    stream.forEach(elem -> c.accumulator().accept(a, elem));

    // (We might use combiner() for e.g. parallel collection.)

    // Pass the A to the finisher, which turns it into a result.
    return c.finisher().apply(a);
}
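To try the helper above end to end, a self-contained sketch along these lines should work. The class name CaptureDemo and the main method are hypothetical, and forEachOrdered is swapped in for forEach only so that the printed order is predictable.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CaptureDemo {
    // The caller-facing method never names the accumulation type.
    public static <T, R> R collect(Stream<T> stream, Collector<T, ?, R> c) {
        // Passing c to a generic method captures the wildcard as A.
        return captureAndCollect(stream, c);
    }

    private static <T, A, R> R captureAndCollect(Stream<T> stream,
                                                 Collector<T, A, R> c) {
        A a = c.supplier().get();
        stream.forEachOrdered(elem -> c.accumulator().accept(a, elem));
        return c.finisher().apply(a);
    }

    public static void main(String[] args) {
        // toList: the accumulator and the result are the same list.
        List<String> list = collect(Stream.of("x", "y"), Collectors.toList());
        // counting: the result is a Long, whatever the accumulator is.
        Long count = collect(Stream.of("x", "y"), Collectors.counting());
        System.out.println(list);  // [x, y]
        System.out.println(count); // 2
    }
}
```

The helper is needed because, as far as I understand capture conversion, each separate use of a wildcard-typed expression gets its own fresh capture, so the supplier's result and the accumulator's first parameter only line up once both are tied to the single type variable A.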

You can also see in the code for toList that it passes Collectors.CH_ID as its characteristics, which declares an identity finish (the IDENTITY_FINISH characteristic). This means that its finisher does nothing except return whatever is passed to it.
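The identity finish can be observed from outside through the public Collector API. The following is only a sketch: the class and method names are invented, and the results noted in the comments rely on the toList source quoted in the question and on joining() finishing with a String.

```java
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class IdentityFinishDemo {
    // True if the collector advertises IDENTITY_FINISH.
    public static boolean declaresIdentityFinish(Collector<?, ?, ?> c) {
        return c.characteristics()
                .contains(Collector.Characteristics.IDENTITY_FINISH);
    }

    // True if the finisher hands back the very object the supplier created.
    public static boolean finisherIsIdentity(Collector<?, ?, ?> c) {
        return finisherIsIdentityCaptured(c);
    }

    // Generic helper so the accumulation type has a name (A).
    private static <T, A, R> boolean finisherIsIdentityCaptured(
            Collector<T, A, R> c) {
        A container = c.supplier().get();
        Object result = c.finisher().apply(container);
        return result == container;
    }

    public static void main(String[] args) {
        // toList: declared identity finish, and the same list comes back.
        System.out.println(declaresIdentityFinish(Collectors.toList())); // true
        System.out.println(finisherIsIdentity(Collectors.toList()));     // true
        // joining: the finisher builds a String, a different object.
        System.out.println(finisherIsIdentity(Collectors.joining()));    // false
    }
}
```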


(This section is referenced by my comment below.)

Here are a couple of alternatives to carrying a type parameter for the accumulator. I think these illustrate why the actual design of the Collector interface is good.

  1. Just use Object, but we end up casting a lot.

    interface Collector<T, R> {
        Supplier<Object> supplier();
        BiConsumer<Object, T> accumulator();
        BiFunction<Object, Object, Object> combiner();
        Function<Object, R> finisher();
    }
    
    static <T> Collector<T, List<T>> toList() {
        return Collector.of(
            ArrayList::new,
            (obj, elem) -> ((List<T>) obj).add(elem),
            (a, b) -> {
                ((List<T>) a).addAll((List<T>) b);
                return a;
            },
            obj -> (List<T>) obj);
    }
    
  2. Hide the accumulator as an implementation detail of the Collector, as the Collector itself does the accumulation internally. I think this could make sense, but it's less flexible and the combiner step becomes more complicated.

    interface Collector<T, R> {
        void accumulate(T elem);
        void combine(Collector<T, R> that);
        R finish();
    }
    
    static <T> Collector<T, List<T>> toList() {
        return new Collector<T, List<T>>() {
            private List<T> list = new ArrayList<>();
            @Override
            public void accumulate(T elem) {
                list.add(elem);
            }
            @Override
            public void combine(Collector<T, List<T>> that) {
                // We could elide calling finish()
                // by using instanceof and casting.
                list.addAll(that.finish());
            }
            @Override
            public List<T> finish() {
                return new ArrayList<>(list);
            }
        };
    }
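For comparison with both alternatives, here is a sketch of the same collector written against the real three-parameter interface using Collector.of. The names TypedAccumulatorDemo and myToList are hypothetical, and this is not the JDK's own implementation. Because the accumulation type is a type parameter, no casts are needed, and the caller still sees only a wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class TypedAccumulatorDemo {
    // Illustrative re-implementation; the accumulation type List<T> is
    // known inside the method but hidden behind "?" in the return type.
    public static <T> Collector<T, ?, List<T>> myToList() {
        return Collector.<T, List<T>>of(
            ArrayList::new,             // supplier: a fresh List<T>
            List::add,                  // accumulator: no cast required
            (left, right) -> {          // combiner: merge two partial lists
                left.addAll(right);
                return left;
            });
    }

    public static void main(String[] args) {
        List<String> result = Stream.of("a", "b").collect(myToList());
        System.out.println(result); // [a, b]
    }
}
```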
    
answered Oct 04 '22 at 17:10 by Radiodef