Spliterator trySplit return type

I have stumbled upon an interesting detail in java.util.Spliterator (Java 8).

The method trySplit() is supposed to return an instance of Spliterator, or null if it can't be split. The Javadoc says the following:

 * @return a {@code Spliterator} covering some portion of the
 * elements, or {@code null} if this spliterator cannot be split.

This looks to me like a perfect place to use java.util.Optional. Per its Javadoc:

 * A container object which may or may not contain a non-null value.

Is there any reason why Optional was not used?

Googling did not help much, apart from this question on the lambda-dev mailing list, which was never answered.

Andrew asked May 12 '15 14:05



2 Answers

There are a couple of reasons it's the way it is. Of course, conceptually, trySplit could return Optional<Spliterator<T>>, but there are some design forces that pushed away from this.

One reason is that there's a difference between methods such as findFirst that return Optional vs. methods such as trySplit that return value-or-null.

  • Methods like findFirst are called by and return values to application code.
  • Methods like trySplit are called by and return values to library code.

A design aspect of the JDK class libraries is that the library APIs are (or should be) designed to make things easier for application code, and library code will often take on more complexity in order to make things simpler for applications.

One of the main reasons for Optional is to avoid passing nulls from the library to application code, because improper null handling is a common source of NullPointerExceptions. Instead of null, APIs like findFirst will return an empty Optional, which is supported by a rich set of methods such as orElse, map, filter, flatMap, etc. that provide a great deal of flexibility to applications for dealing with the not-found case.
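As a small illustration of the application-facing side, the following sketch shows findFirst returning an Optional that the caller handles with map and orElse. The class name FindFirstExample, the method firstMatch, and the "<none>" fallback are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class FindFirstExample {
    // Hypothetical helper: first name with the given prefix, upper-cased,
    // or the placeholder "<none>" when nothing matches.
    public static String firstMatch(List<String> names, String prefix) {
        Optional<String> first = names.stream()
                .filter(n -> n.startsWith(prefix))
                .findFirst();
        // No null check needed: the empty case is handled by orElse.
        return first.map(String::toUpperCase).orElse("<none>");
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("alice", "bob", "carol");
        System.out.println(firstMatch(names, "b")); // prints BOB
        System.out.println(firstMatch(names, "z")); // prints <none>
    }
}
```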

Note that the nullable return value from trySplit is going in the opposite direction: from the application to the library.
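To make that direction concrete, here is a minimal sketch of an application-written Spliterator over an integer range. The class RangeSpliterator is invented for this example and is not a JDK class. Its trySplit returns null when fewer than two elements remain, and the stream library is the only code that ever sees that null.

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

// Hypothetical example class: covers the integers in [next, end).
public class RangeSpliterator implements Spliterator<Integer> {
    private int next;
    private final int end;

    public RangeSpliterator(int start, int end) {
        this.next = start;
        this.end = end;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Integer> action) {
        if (next >= end) {
            return false;
        }
        action.accept(next++);
        return true;
    }

    @Override
    public Spliterator<Integer> trySplit() {
        int remaining = end - next;
        if (remaining < 2) {
            return null; // cannot split further; the library handles the null
        }
        int mid = next + remaining / 2;
        // Hand off the prefix [next, mid) and keep [mid, end) for ourselves.
        Spliterator<Integer> prefix = new RangeSpliterator(next, mid);
        next = mid;
        return prefix;
    }

    @Override
    public long estimateSize() {
        return end - next;
    }

    @Override
    public int characteristics() {
        return ORDERED | SIZED | SUBSIZED | NONNULL;
    }

    public static void main(String[] args) {
        int sum = StreamSupport.stream(new RangeSpliterator(0, 100), true)
                .mapToInt(Integer::intValue)
                .sum();
        System.out.println(sum); // prints 4950
    }
}
```

The application code above never dereferences the value it returns from trySplit, so the null cannot cause an NPE on the application side.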

Having application code pass or return a nullable value to the library is considerably less error-prone for the application than having it receive a nullable value from the library. If you're writing an application and the API says that you should pass or return a null to the library, there's no possibility that this will generate an NPE in your code. Indeed, there are a variety of places in the APIs (List.sort(null) comes to mind) where null has particular semantics in the API.
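The List.sort(null) case can be shown in a few lines. In this sketch the class and method names are made up; the behavior relied on is the documented one, where a null comparator means the elements' natural ordering is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortWithNullComparator {
    // Hypothetical helper: returns a naturally ordered copy of the input.
    public static List<String> sortedNaturally(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(null); // null comparator: use the elements' natural ordering
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedNaturally(Arrays.asList("pear", "apple", "fig")));
        // prints [apple, fig, pear]
    }
}
```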

trySplit is called from relatively few places in the library, and the library maintainers are taking on the burden of dealing properly with null in all of those cases.
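What that library-side null handling looks like can be sketched roughly as follows. This is not the JDK's actual implementation (which uses fork/join tasks and size thresholds); it is a simplified, sequential recursion with invented names, meant only to show the single null check at the call site.

```java
import java.util.Arrays;
import java.util.Spliterator;

public class SplitSketch {
    // Simplified stand-in for library code: split until trySplit() returns
    // null, then process the remaining elements of that piece sequentially.
    public static long sumBySplitting(Spliterator<Integer> s) {
        Spliterator<Integer> prefix = s.trySplit();
        if (prefix == null) {
            // Cannot split any further: traverse what is left.
            long[] total = {0};
            s.forEachRemaining(x -> total[0] += x);
            return total[0];
        }
        // A real implementation would hand the two halves to different threads.
        return sumBySplitting(prefix) + sumBySplitting(s);
    }

    public static void main(String[] args) {
        Spliterator<Integer> s =
                Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).spliterator();
        System.out.println(sumBySplitting(s)); // prints 55
    }
}
```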

Another prime consideration is performance. Splitting is in the critical path of setting up a parallel pipeline. It's performed sequentially, before work is handed off to different threads to be executed in parallel. Per Amdahl's Law, in order to make parallelism as efficient as possible, you want to minimize the sequential setup overhead.
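For reference, Amdahl's Law can be written as below, where p is the fraction of the work that can run in parallel and n is the number of processors. The numbers in the worked example are purely illustrative and are not measurements of any stream pipeline.

```latex
S(n) = \frac{1}{(1 - p) + \frac{p}{n}}

% Illustrative values: p = 0.95, n = 8
S(8) = \frac{1}{0.05 + \frac{0.95}{8}} = \frac{1}{0.16875} \approx 5.93

% Upper bound as n grows without limit
\lim_{n \to \infty} S(n) = \frac{1}{1 - p} = \frac{1}{0.05} = 20
```

With these example values, a sequential portion of only 5% caps the speedup at 20 no matter how many cores are available, which is why sequential setup work such as splitting is worth keeping cheap.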

The fact is that an Optional is a box, and there is a cost to boxing and unboxing a value to and from an Optional. The JIT compiler might be able to optimize this away in some cases, but it might not. Even if it does, there's a period of time where the code is running but the Optional hasn't yet been optimized away. That's additional overhead. Since the library code is willing to bear the burden of handling null properly, we can guarantee there's no boxing overhead simply by not using Optional at all in this case.
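To see where the box would come from, consider this sketch of a hypothetical Optional-returning variant. The method trySplitOptional does not exist in the JDK; it is an assumption made for illustration, wrapping the real trySplit.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.Spliterator;

public class OptionalSplitSketch {
    // Hypothetical wrapper, NOT a JDK method: every successful split
    // allocates a new Optional; a failed split returns the shared empty one.
    public static <T> Optional<Spliterator<T>> trySplitOptional(Spliterator<T> s) {
        return Optional.ofNullable(s.trySplit());
    }

    // Counts the unsplittable pieces reached by splitting as far as possible.
    public static <T> int countLeaves(Spliterator<T> s) {
        Optional<Spliterator<T>> prefix = trySplitOptional(s);
        if (!prefix.isPresent()) {
            return 1;
        }
        // The Optional is unwrapped immediately, so the box bought nothing here.
        return countLeaves(prefix.get()) + countLeaves(s);
    }

    public static void main(String[] args) {
        Spliterator<Integer> s = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8).spliterator();
        System.out.println(countLeaves(s)); // prints 8
    }
}
```

Each successful split in this sketch creates an Optional that is discarded one line later, unless the JIT compiler manages to eliminate the allocation.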

Stuart Marks answered Sep 28 '22 09:09


Spliterator is part of the internal stream implementation. It is not meant to be used in business logic, where Optional would be convenient. It is a quite low-level interface whose main goal is to be fast, so there is no reason to use Optional there.

You might argue that the JIT compiler can usually eliminate an Optional. However, that's not always the case. For example, the default maximum call depth for inlining in the HotSpot JIT compiler is 9 (the MaxInlineLevel flag, as of Java 8), and typical stream processing has more stack frames than that, so even one additional stack frame may prevent the optimization.

Tagir Valeev answered Sep 28 '22 08:09