Optional::stream
returns a Stream containing the value if it is present and else an empty stream. So for a Stream<Optional<T>> optionals
,
optionals.flatMap(Optional::stream)
returns a Stream<T>
containing the present values of all optionals. But regarding the functionality behind it, I‘m unsure how efficient it is to create an own stream for every present value and then flatMapping the stream of streams.
But even in the documentation this is mentioned as an intended use.
Why isn‘t streaming a stream of optionals highly inefficient compared with first filtering all present values and then mapping the optionals to their values?
Optional Class | isPresent() function The isPresent() function in Optional Class is used to evaluate whether the value if assigned to variable is present or not. Syntax value.isPresent() Returns: It returns true if value is assigned otherwise false.
The stream() method of java. util. Optional class in Java is used to get the sequential stream of the only value present in this Optional instance. If there is no value present in this Optional instance, then this method returns returns an empty Stream.
ofNullable(T value) Returns an Optional describing the specified value, if non-null, otherwise returns an empty Optional. isPresent() Return true if there is a value present, otherwise false. get() If a value is present in this Optional, returns the value, otherwise throws NoSuchElementException.
Filtering a Stream of Optionals in Java 1 Introduction. In this article, we're going to talk about how to filter out non-empty values from a Stream of Optionals. 2 Using filter () 3 Using flatMap () List<String> filteredList = listOfOptionals.stream () .flatMap (o -> o.isPresent () ? 4 Java 9's Optional::stream. ... 5 Conclusion. ...
In the first approach, We use the stream filter () method that takes the Predicate condition. In the predicate, Call one of the Optional methods isPresent () method returns if the value inside the optional is not null. Returns false if the value is null. If the isPresent () returns true, next use Optional::get method to retrieve the actual value.
The other option would be to use flatMap with a lambda expression that converts an empty Optional to an empty Stream instance, and non-empty Optional to a Stream instance containing only one element: List<String> filteredList = listOfOptionals.stream () .flatMap (o -> o.isPresent () ?
Otherwise return an empty Optional. The map (Stream::of) call returns an Optional<Stream<Other>>. If a value was present in the input Optional, the returned Optional contains a Stream that contains the single Other result. But if the value was not present, the result is an empty Optional.
I got curious and wrote a simple micro-benchmark using JMH.
And it turns out that there is a quite severe performance penalty for using flatMap(Optional::stream)
over filter(Optional::isPresent).map(Optional::get)
.
Java 16 introduced mapMulti
which is similar to flatMap
in usage and has performance characteristics very close to those of filter
/map
.
Each of my benchmark methods takes a list of Optional<Integer>
and computes the sum of all present values.
I implemented three approaches:
flatMap
as posited the questionfilter
and map
as described in the questionmapMulti
introduced in JDK 16.Note that I did not make use of the flatMapToInt
or mapMultiToInt
methods, which are probably more efficient, because I didn't want to focus on the streams-over-wrapper objects aspect and just compare the usage of streams over Optional
objects.
For all approaches I ran the benchmark with a full list (all values present), a half empty list (every second value present), and an entirely empty list (each optional is empty). The lists are all the same length (arbitrarily picked 10 000 elements each).
The unit for the values is us/op (microseconds per operation, meaning one full stream evaluation).
Approach | Full List | Half Empty List | Empty List |
---|---|---|---|
flatMap |
207.219 ± 1.176 | 175.355 ± 4.955 | 142.986 ± 2.821 |
filter /map
|
12.856 ± 0.375 | 12.086 ± 0.451 | 6.856 ± 0.143 |
mapMulti |
13.990 ± 0.353 | 11.685 ± 0.276 | 7.034 ± 0.199 |
Note that the absolute numbers here are specific to my machine running JDK 16 and are mostly irrelevant anyway. The relative differences are what is important here.
It seems the flatMap
approach is both significantly slower and more variable. If I had to guess the variability comes from increased GC pressure caused by all the Stream
objects created, even for empty results.
Disclaimer: this is obviously just a single made-up example being tested and the benchmark hasn't been peer-reviewed (yet), so don't take these results for granted without further investigation.
Full benchmark code below (note that I turned down some iterations/runtimes to get responses in a reasonable time and hard-coded to use 4 threads. Adapt as required.)
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.TimeUnit;
@Fork(value = 1, warmups = 0)
@Warmup(iterations = 5, time = 5)
@Measurement(iterations = 5, time = 5)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Threads(4)
public class MyBenchmark {
@State(Scope.Benchmark)
public static class MyLists {
private static final int LIST_SIZE = 10_000;
public final List<Optional<Integer>> allValues;
public final List<Optional<Integer>> halfEmpty;
public final List<Optional<Integer>> allEmpty;
public MyLists() {
List<Optional<Integer>> allValues = new ArrayList<>(LIST_SIZE);
List<Optional<Integer>> halfEmpty = new ArrayList<>(LIST_SIZE);
List<Optional<Integer>> allEmpty = new ArrayList<>(LIST_SIZE);
for (int i = 0; i < LIST_SIZE; i++) {
Optional<Integer> o = Optional.of(i);
allValues.add(o);
halfEmpty.add(i % 2 == 0 ? o : Optional.empty());
allEmpty.add(Optional.empty());
}
this.allValues = Collections.unmodifiableList(allValues);
this.halfEmpty = Collections.unmodifiableList(halfEmpty);
this.allEmpty = Collections.unmodifiableList(allEmpty);
}
}
@Benchmark
public long filter_and_map_allValues(MyLists lists) {
return filterAndMap(lists.allValues);
}
@Benchmark
public long filter_and_map_halfEmpty(MyLists lists) {
return filterAndMap(lists.halfEmpty);
}
@Benchmark
public long filter_and_map_allEmpty(MyLists lists) {
return filterAndMap(lists.allEmpty);
}
@Benchmark
public long flatMap_allValues(MyLists lists) {
return flatMap(lists.allValues);
}
@Benchmark
public long flatMap_halfEmpty(MyLists lists) {
return flatMap(lists.halfEmpty);
}
@Benchmark
public long flatMap_allEmpty(MyLists lists) {
return flatMap(lists.allEmpty);
}
@Benchmark
public long mapMulti_allValues(MyLists lists) {
return mapMulti(lists.allValues);
}
@Benchmark
public long mapMulti_halfEmpty(MyLists lists) {
return mapMulti(lists.halfEmpty);
}
@Benchmark
public long mapMulti_allEmpty(MyLists lists) {
return mapMulti(lists.allEmpty);
}
private long filterAndMap(List<Optional<Integer>> input) {
return input.stream().filter(Optional::isPresent).map(Optional::get).mapToInt(Integer::intValue).sum();
}
private long flatMap(List<Optional<Integer>> input) {
return input.stream().flatMap(Optional::stream).mapToInt(Integer::intValue).sum();
}
private long mapMulti(List<Optional<Integer>> input) {
// Unfortunately the type witness <Integer> is necessary here, as type inference would otherwise make mapMulti produce a Stream<Object>.
return input.stream().<Integer>mapMulti(Optional::ifPresent).mapToInt(Integer::intValue).sum();
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With