
What are good reasons for choosing invariance in an API like Stream.reduce()?

Reviewing the Java 8 Stream API design, I was surprised by the generic invariance of the Stream.reduce() arguments:

<U> U reduce(U identity,
             BiFunction<U,? super T,U> accumulator,
             BinaryOperator<U> combiner)

A seemingly more versatile version of the same API might have applied covariance / contravariance on individual references to U, such as:

<U> U reduce(U identity,
             BiFunction<? super U, ? super T, ? extends U> accumulator,
             BiFunction<? super U, ? super U, ? extends U> combiner)

This would allow for the following, which currently isn't possible:

// Assuming we want to reuse these tools all over the place:
BiFunction<Number, Number, Double> numberAdder =
    (t, u) -> t.doubleValue() + u.doubleValue();

// This currently doesn't work, but would work with the suggestion
Stream<Number> stream = Stream.of(1, 2L, 3.0);
double sum = stream.reduce(0.0, numberAdder, numberAdder);

A workaround is to use method references to "coerce" the functions into the target type:

double sum = stream.reduce(0.0, numberAdder::apply, numberAdder::apply);
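
For completeness, here is a minimal, self-contained sketch of that workaround. The class name ReduceWorkaround and the method demo() are made up for illustration and are not part of the original snippet.

```java
import java.util.function.BiFunction;
import java.util.stream.Stream;

// Hypothetical demo class; only numberAdder comes from the question.
class ReduceWorkaround {

    // Reusable adder: accepts any two Numbers and returns a Double.
    static final BiFunction<Number, Number, Double> NUMBER_ADDER =
        (t, u) -> t.doubleValue() + u.doubleValue();

    static double demo() {
        Stream<Number> stream = Stream.of(1, 2L, 3.0);
        // NUMBER_ADDER is neither a BiFunction<Double, ? super Number, Double>
        // nor a BinaryOperator<Double>, so passing it directly does not compile.
        // Each method reference creates a new function object of the target type.
        return stream.reduce(0.0, NUMBER_ADDER::apply, NUMBER_ADDER::apply);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6.0
    }
}
```

The cost is one extra wrapper object per method reference, which is usually negligible next to the reduction itself.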

C# doesn't have this particular problem, as Func<T1, T2, TResult> is defined as follows, using declaration-site variance, which means that any API using Func gets this behaviour for free:

public delegate TResult Func<in T1, in T2, out TResult>(
    T1 arg1,
    T2 arg2
)

What are the advantages (and possibly, the reasons for EG decisions) of the existing design over the suggested design?

Or, asked differently, what are the caveats of the suggested design that I might be overlooking (e.g. type inference difficulties, parallelisation constraints, constraints specific to the reduction operation such as associativity, or anticipation of declaration-site variance on BiFunction<in T, in U, out R> in a future Java version)?

Lukas Eder asked Feb 28 '16 09:02



2 Answers

Crawling through the history of the lambda development and isolating "THE" reason for this decision is difficult - so eventually, one will have to wait for one of the developers to answer this question.

Some hints may be the following:

  • The stream interfaces have undergone several iterations and refactorings. One of the earliest versions of the Stream interface had dedicated reduce methods, and the one closest to the reduce method in the question was still called Stream#fold back then. It already received a BinaryOperator as the combiner parameter.

  • Interestingly, for quite a while, the lambda proposal included a dedicated interface Combiner<T,U,R>. Counterintuitively, this was not used as the combiner in the Stream#reduce function. Instead, it was used as the reducer, which seems to be what nowadays is referred to as the accumulator. However, the Combiner interface was replaced with BiFunction in a later revision.

  • The most striking similarity to the question here is found in a thread about the Stream#flatMap signature on the mailing list, which then turned into a general question about the variance of the stream method signatures. They fixed these in some places, for example:

    As Brian correct me:

    <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper);

    instead of:

    <R> Stream<R> flatMap(Function<T, Stream<? extends R>> mapper);

    But noticed that in some places, this was not possible:

    T reduce(T identity, BinaryOperator<T> accumulator);

    and

    Optional<T> reduce(BinaryOperator<T> accumulator);

    Can't be fixed because they used 'BinaryOperator', But if 'BiFunction' is used then we have more flexibility

    <U> U reduce(U identity, BiFunction<? super U, ? super T, ? extends U> accumulator, BinaryOperator<U> combiner)

    Instead of:

    <U> U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator<U> combiner);

    Same comment regarding 'BinaryOperator'

    (emphasis by me).


The only justification that I found for not replacing the BinaryOperator with a BiFunction was eventually given in the response to this statement, in the same thread:

BinaryOperator will not be replaced by BiFunction even if, as you said, it introduce more flexibility, a BinaryOperator ask that the two parameters and the return type to be the same so it has conceptually more weight (the EG already votes on that).
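
Since the API keeps BinaryOperator, callers who want the more flexible signature can layer it on top themselves. The following is only a sketch under that assumption: VariantReduce and its static reduce method are hypothetical names, not part of the JDK, and the helper simply adapts the wildcard-heavy parameters to the invariant Stream.reduce.

```java
import java.util.function.BiFunction;
import java.util.stream.Stream;

// Hypothetical utility class, not part of the JDK.
final class VariantReduce {

    // Exposes the signature suggested in the question as a static helper.
    static <T, U> U reduce(Stream<T> stream,
                           U identity,
                           BiFunction<? super U, ? super T, ? extends U> accumulator,
                           BiFunction<? super U, ? super U, ? extends U> combiner) {
        // The method references re-target both functions to the exact types
        // that Stream.reduce expects: BiFunction<U, ? super T, U> and
        // BinaryOperator<U>.
        return stream.<U>reduce(identity, accumulator::apply, combiner::apply);
    }

    static double demo() {
        BiFunction<Number, Number, Double> numberAdder =
            (t, u) -> t.doubleValue() + u.doubleValue();
        Stream<Number> stream = Stream.of(1, 2L, 3.0);
        // The same function object now serves as accumulator and combiner.
        return reduce(stream, 0.0, numberAdder, numberAdder);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6.0
    }
}
```

This keeps BinaryOperator's "same type in, same type out" contract in the core API while moving the variance into an opt-in adapter.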

Maybe someone can dig out a particular reference to the vote of the Expert Group that governed this decision, but maybe this quote already sufficiently answers the question of why it is the way it is...

Marco13 answered Oct 19 '22 08:10



In my opinion it's just that there's no real use case for the proposed enhancement. The proposed signature has two more type arguments and five more wildcards than the official one. I guess that is reason enough to keep the official API simple, because regular Java developers don't want to (and often aren't able to) lose their minds trying to make the compiler happy. Just for the record, your reduce() has 165 characters in the type signature alone.

Also, arguments to .reduce() are often supplied as lambda expressions, so there's no real point in having more versatile versions: such expressions usually contain no business logic, or only very simple logic, and are therefore used only once.
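
To illustrate the point about inline lambdas, here is a small sketch (the class name InlineReduce is made up for this example). With lambdas written at the call site, the compiler infers the parameter types from the target type, so the invariance never gets in the way:

```java
import java.util.stream.Stream;

// Hypothetical demo class for illustration only.
final class InlineReduce {

    static double demo() {
        Stream<Number> stream = Stream.of(1, 2L, 3.0);
        // U is inferred as Double; the lambda's parameters are typed
        // (Double acc, Number n) by the compiler, no wildcards needed.
        return stream.reduce(0.0,
                             (acc, n) -> acc + n.doubleValue(),
                             Double::sum);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6.0
    }
}
```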

For example, I'm a user of your fantastic jOOQ library and also a curious Java developer who loves generics puzzles, but I often miss the simplicity of SQL tuples when I have to put wildcards in my own interfaces because of the type parameter in Result<T> and the kind of trouble it causes when dealing with interfaces of the record types. Not that it's jOOQ's fault.

Raffaele answered Oct 19 '22 09:10
