Java flatmap Iterator<Pair<Stream<A>, Stream<B>>> to Pair<Stream<A>, Stream<B>>

I'm trying to implement a method with the following signature:

public static <A,B> Pair<Stream<A>, Stream<B>> flatten(Iterator<Pair<Stream<A>, Stream<B>>> iterator);

Where the goal of the method is to flatten each of the stream types into a single stream and wrap the output in a pair. I only have an Iterator (not an Iterable) and I can't alter the method signature, so I have to perform the flattening in a single iteration.

My current best implementation is

public static <A,B> Pair<Stream<A>, Stream<B>> flatten(Iterator<Pair<Stream<A>, Stream<B>>> iterator) {
    Stream<A> aStream = Stream.empty();
    Stream<B> bStream = Stream.empty();
    while(iterator.hasNext()) {
        Pair<Stream<A>, Stream<B>> elm = iterator.next();
        aStream = Stream.concat(aStream, elm.first);
        bStream = Stream.concat(bStream, elm.second);
    }
    return Pair.of(aStream, bStream);
}

But while this is technically correct I'm not super happy with this for two reasons:

  1. Stream.concat warns against doing this kind of thing because it may lead to a StackOverflowError.
  2. Stylistically I'd rather it be purely functional if possible instead of having to loop over the iterator and re-assign the streams throughout.

It feels like Stream#flatMap should be suited here (after transforming the input Iterator to a Stream using Guava's Streams.stream(Iterator)), but it doesn't seem to work because of the Pair type in the middle.

One additional requirement is that any of the iterators or streams may be very large (the input could contain anything from a single pair of exceedingly large streams to many one-item streams, for example), so solutions ideally shouldn't collect results into in-memory collections.

Mshnik asked Jun 24 '17 10:06


1 Answer

Well, Guava's Streams.stream is no magic; internally it's just:

StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false);

So there is probably no need to pull Guava in just for that; you could call StreamSupport directly.
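As a minimal sketch of calling it directly, the one-liner above can be wrapped in a small helper. The class name IteratorStreams and the method name streamOf are made up for illustration; only StreamSupport.stream and Spliterators.spliteratorUnknownSize come from the JDK.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {

    // Hypothetical helper (name is an assumption): wraps an Iterator
    // in a lazy, sequential Stream without pulling in Guava.
    public static <T> Stream<T> streamOf(Iterator<T> iterator) {
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(iterator, 0), // 0 = no spliterator characteristics
                false);                                            // false = sequential, not parallel
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        List<String> result = streamOf(it).collect(Collectors.toList());
        System.out.println(result); // prints [a, b, c]
    }
}
```

The iterator is only advanced when a terminal operation runs on the returned stream, so nothing is buffered up front.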

And you could use Stream.Builder just for that:

public static <A, B> Pair<Stream<A>, Stream<B>> flatten(Iterator<Pair<Stream<A>, Stream<B>>> iterator) {

    Stream.Builder<Stream<A>> builderA = Stream.builder();
    Stream.Builder<Stream<B>> builderB = Stream.builder();

    iterator.forEachRemaining(pair -> {
        builderA.add(pair.first);
        builderB.add(pair.second);
    });

    return Pair.of(builderA.build().flatMap(Function.identity()), builderB.build().flatMap(Function.identity()));
}
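To try the method above end to end, here is a self-contained sketch. The Pair class in it is a hypothetical stand-in (the question does not say which Pair implementation is used), assumed to expose public first and second fields and a static of factory; FlattenDemo is likewise just an illustrative name.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlattenDemo {

    // Hypothetical stand-in for the Pair type from the question (an assumption).
    static final class Pair<X, Y> {
        final X first;
        final Y second;

        private Pair(X first, Y second) {
            this.first = first;
            this.second = second;
        }

        static <X, Y> Pair<X, Y> of(X first, Y second) {
            return new Pair<>(first, second);
        }
    }

    public static <A, B> Pair<Stream<A>, Stream<B>> flatten(Iterator<Pair<Stream<A>, Stream<B>>> iterator) {
        Stream.Builder<Stream<A>> builderA = Stream.builder();
        Stream.Builder<Stream<B>> builderB = Stream.builder();

        // Single pass over the iterator: only the Stream references are stored here,
        // the elements inside each stream are not touched yet.
        iterator.forEachRemaining(pair -> {
            builderA.add(pair.first);
            builderB.add(pair.second);
        });

        // flatMap pulls elements lazily, one inner stream after another.
        return Pair.of(builderA.build().flatMap(Function.identity()),
                       builderB.build().flatMap(Function.identity()));
    }

    public static void main(String[] args) {
        Pair<Stream<Integer>, Stream<String>> p1 = Pair.of(Stream.of(1, 2), Stream.of("a"));
        Pair<Stream<Integer>, Stream<String>> p2 = Pair.of(Stream.of(3), Stream.of("b", "c"));
        Iterator<Pair<Stream<Integer>, Stream<String>>> it = Arrays.asList(p1, p2).iterator();

        Pair<Stream<Integer>, Stream<String>> flat = flatten(it);
        System.out.println(flat.first.collect(Collectors.toList()));  // prints [1, 2, 3]
        System.out.println(flat.second.collect(Collectors.toList())); // prints [a, b, c]
    }
}
```

Note that the builders hold one reference per inner stream, so the iterator is fully consumed before flatten returns; the stream elements themselves are not collected, but an input made of very many tiny streams still costs memory proportional to the number of pairs.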
Eugene answered Sep 21 '22 17:09