
Java Stream API: Looking for elegant way for filterAndMap

The usual 'best practice' for filtering and mapping a stream is:

Stream<T> source;
// ...
Predicate<T> predicate; // = ...
Function<T, U> mapper; // = ...
Stream<U> dst = source
         .filter(predicate)
         .map(mapper);

In many software projects you reach a point where the same filter and map operations have to be applied to several streams. For example, a collection of objects of class T should be transformed into a List of objects of class U, where U is a subclass of T, keeping only the instances of U. One could write:

Collection<T> source;
// ...
List<U> dst = source.stream()
            .filter(U.class::isInstance)
            .map(U.class::cast)
            .collect(Collectors.toList());

To generalize this, I wrote a helper method called onlyInstancesOf:

static <T, U> Function<T, Stream<U>> onlyInstancesOf(Class<U> clazz) {
    return t -> clazz.isInstance(t)
            ? Stream.of(clazz.cast(t))
            : Stream.empty();
}

This method is intended to be used with flatMap:

List<U> dst = source.stream()
            .flatMap(onlyInstancesOf(U.class))
            .collect(Collectors.toList());
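
To make the comparison concrete, here is a minimal self-contained sketch that runs both variants on the same input. Number and Integer stand in for T and U; the class name OnlyInstancesOfDemo and the methods viaFilterAndMap and viaFlatMap are hypothetical names chosen for this illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OnlyInstancesOfDemo {

    // Same helper as above: a one-element stream for instances of clazz,
    // an empty stream for everything else.
    static <T, U> Function<T, Stream<U>> onlyInstancesOf(Class<U> clazz) {
        return t -> clazz.isInstance(t)
                ? Stream.of(clazz.cast(t))
                : Stream.empty();
    }

    // Classic variant: filter first, then map.
    static List<Integer> viaFilterAndMap(List<Number> source) {
        return source.stream()
                .filter(Integer.class::isInstance)
                .map(Integer.class::cast)
                .collect(Collectors.toList());
    }

    // flatMap variant using the helper.
    static List<Integer> viaFlatMap(List<Number> source) {
        return source.stream()
                .flatMap(onlyInstancesOf(Integer.class))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed sample data: three Integers, one Double, one Long.
        List<Number> source = Arrays.asList(1, 2.5, 3, 4L, 5);
        System.out.println(viaFilterAndMap(source)); // prints [1, 3, 5]
        System.out.println(viaFlatMap(source));      // prints [1, 3, 5]
    }
}
```

Both methods return the same list; the difference is that the flatMap variant allocates a small Stream object for every element.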

Another function I use very often is optionalPresent, which processes a stream that contains Optionals:

static <T> Function<Optional<T>, Stream<T>> optionalPresent() {
    return t -> t.map(Stream::of).orElse(Stream.empty());
}

and the usage:

Collection<Optional<T>> source;
// ...
List<T> dst = source.stream()
        .flatMap(optionalPresent())
        .collect(Collectors.toList());
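
For comparison, a small sketch that puts the helper next to two alternatives: the classic filter(Optional::isPresent).map(Optional::get) chain, and Optional::stream, which exists since Java 9 and does the same job as optionalPresent. The class name OptionalPresentDemo, the three via... method names, and the String element type are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OptionalPresentDemo {

    // Same helper as above.
    static <T> Function<Optional<T>, Stream<T>> optionalPresent() {
        return t -> t.map(Stream::of).orElse(Stream.empty());
    }

    static List<String> viaHelper(List<Optional<String>> source) {
        return source.stream()
                .flatMap(optionalPresent())
                .collect(Collectors.toList());
    }

    // Classic variant: filter first, then map.
    static List<String> viaFilterAndMap(List<Optional<String>> source) {
        return source.stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    // Requires Java 9 or later: Optional.stream() returns a stream of
    // zero or one elements.
    static List<String> viaOptionalStream(List<Optional<String>> source) {
        return source.stream()
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Optional<String>> source =
                Arrays.asList(Optional.of("a"), Optional.empty(), Optional.of("b"));
        System.out.println(viaHelper(source));         // prints [a, b]
        System.out.println(viaFilterAndMap(source));   // prints [a, b]
        System.out.println(viaOptionalStream(source)); // prints [a, b]
    }
}
```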

These solutions look elegant at first glance, but they have one big disadvantage: they are more than 10 times slower than the "classical" solution of first filtering and then mapping.

How would you suggest handling these frequently used filter-and-map idioms without violating the DRY principle?

Torsten Fehre asked Dec 20 '17 13:12


1 Answer

You could use a collector (since you're always collecting anyway) that keeps only the instances of a certain class:

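static <T, U extends T> Collector<T, ?, List<U>> onlyInstancesOfCollector(Class<U> clazz) {
    return Collector.of(
            ArrayList::new,                  // supplier: a fresh result list
            (acc, e) -> {                    // accumulator: keep e only if it is a U
                if (clazz.isInstance(e)) {
                    acc.add(clazz.cast(e));
                }
            },
            (a, b) -> {                      // combiner: merge partial results (parallel streams)
                a.addAll(b);
                return a;
            });
}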

...

List<U> dst = source.stream()
    .collect(onlyInstancesOfCollector(U.class));
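
As a rough, self-contained illustration of how this collector behaves, the sketch below applies it to a sequential and a parallel stream. Number and Integer stand in for T and U; the class name OnlyInstancesOfCollectorDemo and the method integersOf are hypothetical and not part of the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class OnlyInstancesOfCollectorDemo {

    // Same collector as above.
    static <T, U extends T> Collector<T, ?, List<U>> onlyInstancesOfCollector(Class<U> clazz) {
        return Collector.of(
                ArrayList::new,
                (acc, e) -> {
                    if (clazz.isInstance(e)) {
                        acc.add(clazz.cast(e));
                    }
                },
                (a, b) -> {
                    a.addAll(b);
                    return a;
                });
    }

    // Collects the Integer elements of source; the combiner is only
    // exercised when the stream runs in parallel.
    static List<Integer> integersOf(List<Number> source, boolean parallel) {
        Stream<Number> stream = parallel ? source.parallelStream() : source.stream();
        return stream.collect(onlyInstancesOfCollector(Integer.class));
    }

    public static void main(String[] args) {
        // Assumed sample data: three Integers, one Double, one Long.
        List<Number> source = Arrays.asList(1, 2.5, 3, 4L, 5);
        System.out.println(integersOf(source, false)); // prints [1, 3, 5]
        System.out.println(integersOf(source, true));  // prints [1, 3, 5]
    }
}
```

Because the collector declares no characteristics, it is neither concurrent nor unordered, so the encounter order of an ordered source such as a List is preserved even for a parallel stream.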

The collector has better performance characteristics than both stream variants (lower is better):

Benchmark           Mode  Cnt  Score   Error  Units
Tests.collector     avgt   10  0.171 ± 0.003   s/op
Tests.filterAndMap  avgt   10  0.203 ± 0.005   s/op
Tests.flatmap       avgt   10  0.375 ± 0.012   s/op

The full JMH benchmark:

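import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode({ Mode.AverageTime })
@Warmup(iterations = 25)
@Measurement(iterations = 10)
@State(Scope.Benchmark)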
public class Tests {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
            .include(Tests.class.getSimpleName())
            .build();
        new Runner(opt).run();
    }

    List<A> input;

    @Setup
    public void setup() {
        Random r = new Random();
        input = new ArrayList<>();
        for(int i = 0; i < 10_000_000; i++) {
            input.add(r.nextInt(2) == 0 ? new A() : new B());
        }
    } 

    @Fork(1)
    @Benchmark
    public List<B> filterAndMap() {
        return input.stream()
            .filter(B.class::isInstance)
            .map(B.class::cast)
            .collect(Collectors.toList());
    }

    @Fork(1)
    @Benchmark
    public List<B> flatmap() {
        return input.stream()
            .flatMap(onlyInstancesOf(B.class))
            .collect(Collectors.toList());
    }

    @Fork(1)
    @Benchmark
    public List<B> collector() {
        return input.stream()
            .collect(onlyInstancesOfCollector(B.class));
    }

    static <T, U> Function<T, Stream<U>> onlyInstancesOf(Class<U> clazz) {
        return t -> clazz.isInstance(t)
                ? Stream.of(clazz.cast(t))
                : Stream.empty();
    }

    static <T, U extends T> Collector<T, ?, List<U>> onlyInstancesOfCollector(Class<U> clazz) {
        return Collector.of(
                ArrayList::new,
                (acc, e) -> {
                    if(clazz.isInstance(e)) {
                        acc.add(clazz.cast(e));
                    }
                },
                (a, b) -> {
                    a.addAll(b);
                    return a;
                });
    }

}

class A {}
class B extends A {}
Jorn Vernee answered Nov 08 '22 07:11