Assume we have a collection of objects that are identified by unique Strings, along with a class Tree that defines a hierarchy on them. That class is implemented using a Map from nodes (represented by their IDs) to Collections of their respective children's IDs.
class Tree {
    private Map<String, Collection<String>> edges;

    // ...

    public Stream<String> descendants(String node) {
        // To be defined.
    }
}
I would like to enable streaming a node's descendants. A simple solution is this:
private Stream<String> children(String node) {
    return edges.getOrDefault(node, Collections.emptyList()).stream();
}

public Stream<String> descendants(String node) {
    return Stream.concat(
        Stream.of(node),
        children(node).flatMap(this::descendants)
    );
}
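To make the traversal order concrete, here is a minimal, self-contained sketch of the simple solution applied to a small tree. The class name SimpleTreeDemo, the link helper, and the sample tree are assumptions made up for illustration; only edges, children, and descendants mirror the code above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; stands in for the Tree class from the question.
class SimpleTreeDemo {
    private final Map<String, Collection<String>> edges = new HashMap<>();

    // Assumed helper for building the example; not part of the original Tree.
    SimpleTreeDemo link(String parent, String... children) {
        edges.put(parent, Arrays.asList(children));
        return this;
    }

    private Stream<String> children(String node) {
        return edges.getOrDefault(node, Collections.emptyList()).stream();
    }

    // The simple recursive solution: the node itself, then each child's subtree.
    Stream<String> descendants(String node) {
        return Stream.concat(
            Stream.of(node),
            children(node).flatMap(this::descendants));
    }

    // A made-up sample tree: root -> (a -> (a1, a2), b -> (b1)).
    static SimpleTreeDemo sample() {
        return new SimpleTreeDemo()
            .link("root", "a", "b")
            .link("a", "a1", "a2")
            .link("b", "b1");
    }

    public static void main(String[] args) {
        List<String> order = sample().descendants("root").collect(Collectors.toList());
        System.out.println(order); // prints [root, a, a1, a2, b, b1]
    }
}
```

The result is a pre-order walk: each node appears before its own descendants, and siblings keep the order of the underlying collection.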
Before continuing, I would like to make the following assertions about this solution. (Am I correct about these?)
Walking the Stream returned from descendants consumes resources (time and memory), relative to the size of the tree, in the same order of complexity as hand-coding the recursion would. In particular, the intermediate objects representing the iteration state (Streams, Spliterators, ...) form a stack and therefore the memory requirement at any given time is in the same order of complexity as the tree's depth.
As I understand it, as soon as I perform a terminal operation on the Stream returned from descendants, the root-level call to flatMap will cause all contained Streams, one for each (recursive) call to descendants, to be realized immediately. Thus, the resulting Stream is only lazy on the first level of recursion, but not beyond. (Edited according to Tagir Valeev's answer.)
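One way to check this assertion on a given JDK is to count how often descendants is invoked when only the first element after the root is requested. The following is a rough probe, not a benchmark; the class name LazinessProbe, the calls counter, and the sample tree are assumptions added for illustration. A fully lazy traversal would need only two calls here (for "root" and "a"); the number actually printed depends on the JDK version and on stream internals, so no expected count is given.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical probe class; descendants is the simple solution plus a counter.
class LazinessProbe {
    private final Map<String, Collection<String>> edges = new HashMap<>();

    // Counts how many times descendants() has been invoked.
    final AtomicInteger calls = new AtomicInteger();

    LazinessProbe() {
        // Made-up sample tree with six nodes.
        edges.put("root", Arrays.asList("a", "b"));
        edges.put("a", Arrays.asList("a1", "a2"));
        edges.put("b", Arrays.asList("b1"));
    }

    private Stream<String> children(String node) {
        return edges.getOrDefault(node, Collections.emptyList()).stream();
    }

    Stream<String> descendants(String node) {
        calls.incrementAndGet();
        return Stream.concat(
            Stream.of(node),
            children(node).flatMap(this::descendants));
    }

    public static void main(String[] args) {
        LazinessProbe probe = new LazinessProbe();
        // Skip the root itself and ask only for the next element.
        Optional<String> first = probe.descendants("root").skip(1).findFirst();
        System.out.println("first descendant: " + first.orElse("none"));
        System.out.println("descendants() calls: " + probe.calls.get() + " (tree has 6 nodes)");
    }
}
```

Any count above two means that parts of the tree were visited although their elements were never requested.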
If I understood these points correctly, my question is this: How can I define descendants so that the resulting Stream is lazy?
I would like the solution to be as elegant as possible, in the sense that I prefer a solution which leaves the iteration state implicit. (To clarify what I mean by that: I know that I could write a Spliterator that walks the tree while maintaining an explicit stack of Spliterators on each level. I would like to avoid that.)
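For comparison, the explicit-state variant mentioned above could look roughly like the following sketch. It uses a stack of Iterators rather than Spliterators to keep it short, and wraps the result with Spliterators.spliteratorUnknownSize. The class name ExplicitStackTree, the link and sample helpers, and the lookups counter are assumptions for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical class showing the explicit-stack approach the question wants to avoid.
class ExplicitStackTree {
    private final Map<String, Collection<String>> edges = new HashMap<>();

    // Counts child lookups, to make the laziness observable (illustration only).
    int lookups;

    ExplicitStackTree link(String parent, String... children) {
        edges.put(parent, Arrays.asList(children));
        return this;
    }

    Stream<String> descendants(String root) {
        Iterator<String> it = new Iterator<String>() {
            // One iterator per level of the current path; the top is the deepest level.
            private final Deque<Iterator<String>> stack = new ArrayDeque<>();
            { stack.push(Collections.singleton(root).iterator()); }

            @Override
            public boolean hasNext() {
                // Drop exhausted levels until one with remaining nodes is on top.
                while (!stack.isEmpty() && !stack.peek().hasNext()) stack.pop();
                return !stack.isEmpty();
            }

            @Override
            public String next() {
                if (!hasNext()) throw new NoSuchElementException();
                String node = stack.peek().next();
                lookups++;
                // Children are looked up only when their parent is actually consumed.
                stack.push(edges.getOrDefault(node, Collections.emptyList()).iterator());
                return node;
            }
        };
        return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    static ExplicitStackTree sample() {
        return new ExplicitStackTree()
            .link("root", "a", "b")
            .link("a", "a1", "a2")
            .link("b", "b1");
    }

    public static void main(String[] args) {
        ExplicitStackTree tree = sample();
        System.out.println(tree.descendants("root").skip(1).findFirst().orElse("none")); // prints a
        System.out.println("child lookups: " + tree.lookups);
    }
}
```

The stack never holds more iterators than the depth of the current path plus one, which matches the memory bound stated in the first assertion, at the cost of managing the iteration state by hand.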
(Is there possibly a way in Java to formulate this as a producer-consumer workflow, like one could use in languages like Julia and Go?)
To me, your solution is already as elegant as possible, and its limited laziness is not your fault. The simplest solution is to wait until it gets fixed by the JRE developers, which has been done with Java 10.
However, if this limited laziness of today's implementation really is a concern, it's perhaps time to solve this in a general way. Well, it is about implementing a Spliterator, but not one specific to your task. Instead, it's a re-implementation of the flatMap operation serving all cases where the limited laziness of the original implementation matters:
import java.util.Objects;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class FlatMappingSpliterator<E,S> extends Spliterators.AbstractSpliterator<E>
implements Consumer<S> {

    static final boolean USE_ORIGINAL_IMPL
        = Boolean.getBoolean("stream.flatmap.usestandard");

    public static <T,R> Stream<R> flatMap(
        Stream<T> in, Function<? super T,? extends Stream<? extends R>> mapper) {

        if(USE_ORIGINAL_IMPL)
            return in.flatMap(mapper);

        Objects.requireNonNull(in);
        Objects.requireNonNull(mapper);
        return StreamSupport.stream(
            new FlatMappingSpliterator<>(sp(in), mapper), in.isParallel()
        ).onClose(in::close);
    }

    final Spliterator<S> src;
    final Function<? super S, ? extends Stream<? extends E>> f;
    Stream<? extends E> currStream;
    Spliterator<E> curr;

    private FlatMappingSpliterator(
        Spliterator<S> src, Function<? super S, ? extends Stream<? extends E>> f) {
        // actually, the mapping function can change the size to anything,
        // but it seems, with the current stream implementation, we are
        // better off with an estimate being wrong by magnitudes than with
        // reporting unknown size
        super(src.estimateSize()+100, src.characteristics()&ORDERED);
        this.src = src;
        this.f = f;
    }

    private void closeCurr() {
        // currStream may be null if curr was handed over without its stream
        try { if(currStream!=null) currStream.close(); }
        finally { currStream=null; curr=null; }
    }

    // receives the next source element and opens its mapped stream
    public void accept(S s) {
        curr=sp(currStream=f.apply(s));
    }

    @Override
    public boolean tryAdvance(Consumer<? super E> action) {
        do {
            if(curr!=null) {
                if(curr.tryAdvance(action))
                    return true;
                closeCurr();
            }
        } while(src.tryAdvance(this));
        return false;
    }

    @Override
    public void forEachRemaining(Consumer<? super E> action) {
        if(curr!=null) {
            curr.forEachRemaining(action);
            closeCurr();
        }
        src.forEachRemaining(s->{
            try(Stream<? extends E> str=f.apply(s)) {
                if(str!=null) str.spliterator().forEachRemaining(action);
            }
        });
    }

    @SuppressWarnings("unchecked")
    private static <X> Spliterator<X> sp(Stream<? extends X> str) {
        return str!=null? ((Stream<X>)str).spliterator(): null;
    }

    @Override
    public Spliterator<E> trySplit() {
        Spliterator<S> split = src.trySplit();
        if(split==null) {
            Spliterator<E> prefix = curr;
            while(prefix==null && src.tryAdvance(s->curr=sp(f.apply(s))))
                prefix=curr;
            curr=null;
            return prefix;
        }
        FlatMappingSpliterator<E,S> prefix=new FlatMappingSpliterator<>(split, f);
        if(curr!=null) {
            // hand over the partially consumed sub-stream together with its
            // owning stream, so that the prefix can close it when exhausted
            prefix.curr=curr;
            prefix.currStream=currStream;
            curr=null;
            currStream=null;
        }
        return prefix;
    }
}
All you need to use it is to add an import static of the flatMap method to your code and change expressions of the form stream.flatMap(function) to flatMap(stream, function).
I.e. in your code

public Stream<String> descendants(String node) {
    return Stream.concat(
        Stream.of(node),
        flatMap(children(node), this::descendants)
    );
}

then you have full lazy behavior. I tested it even with infinite streams…
Note that I added a toggle to allow turning back to the original implementation, e.g. when specifying -Dstream.flatmap.usestandard=true on the command line.
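As a small illustration of how such a toggle is evaluated: Boolean.getBoolean reads a system property and returns true only if the property exists and equals "true" (ignoring case). The class name ToggleDemo and the useStandard helper below are made up for this sketch. Note that the class above reads the property once, into a static final field, so the value has to be set before that class is initialized.

```java
// Hypothetical demo of the system-property toggle; not part of the answer's class.
class ToggleDemo {
    static final String KEY = "stream.flatmap.usestandard";

    // Re-reads the property on every call, unlike the static final field above.
    static boolean useStandard() {
        return Boolean.getBoolean(KEY);
    }

    public static void main(String[] args) {
        // Run with: java -Dstream.flatmap.usestandard=true ToggleDemo
        System.out.println("use standard flatMap: " + useStandard());
    }
}
```

Any value other than "true", including an absent property, selects the re-implemented flatMap.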