Java parallel streams: is there a way to navigate a binary tree?

I'm struggling to find a proper way to get a speedup from this stream:

    StreamSupport.stream(new BinaryTreeSpliterator(root), true)
                .parallel()
                .map(node -> processor.onerousFunction(node.getValue()))
                .mapToInt(i -> i.intValue())
                .sum()

onerousFunction() is just a function that makes the thread work for a bit and returns the int value of the node.

No matter how many CPUs I use, the execution time always remains the same. I think the problem lies in the Spliterator I wrote:

    public class BinaryTreeSpliterator extends AbstractSpliterator<Node> {

        private LinkedBlockingQueue<Node> nodes = new LinkedBlockingQueue<>();

        public BinaryTreeSpliterator(Node root) {
            super(Long.MAX_VALUE, NONNULL | IMMUTABLE);
            this.nodes.add(root);
        }

        @Override
        public boolean tryAdvance(Consumer<? super Node> action) {
            Node current = this.nodes.poll();
            if(current != null) {
                action.accept(current);
                if(current.getLeft() != null) 
                    this.nodes.offer(current.getLeft());
                if(current.getRight() != null)
                    this.nodes.offer(current.getRight());
                return true;
            }
            return false;
        }

    }

But I really can't find a good solution.

asked Jan 08 '18 10:01

Stefano Silvi

1 Answer

To process data in parallel, you need a trySplit implementation that returns part of the data as a new Spliterator instance. Each spliterator instance is traversed by a single thread, so, by the way, you don’t need a thread-safe collection within your spliterator. Your problem is that you are inheriting the trySplit implementation from AbstractSpliterator, which does attempt to provide some parallel support despite not knowing anything about your data.

It does so by requesting some items sequentially, buffering them into an array, and returning a new array-based spliterator. Unfortunately, it does not handle “unknown size” very well (the same applies to the parallel stream implementation in general). It will buffer 1024 elements by default, and even more the next time if there are that many elements. Even worse, the stream implementation will not use the array-based spliterator’s good splitting capabilities, because it treats “unknown size” like the literal Long.MAX_VALUE, concluding that your spliterator has many more elements than the 1024 elements in the array, and hence will not even try to split the array-based spliterator.
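The batching described above can be observed directly. The following is only a sketch: BatchSplitDemo and CountingSpliterator are hypothetical names made up for this illustration, and the spliterator simply counts upwards while reporting Long.MAX_VALUE as its size, like the one in the question. It calls the inherited trySplit twice and drains each part to see how many elements it really holds.

```java
import java.util.Spliterator;
import java.util.Spliterators.AbstractSpliterator;
import java.util.function.Consumer;

/** Hypothetical demo; none of these names come from the question. */
final class BatchSplitDemo {

    /** Emits 0..limit-1 one by one while reporting an unknown size. */
    static final class CountingSpliterator extends AbstractSpliterator<Integer> {
        private final int limit;
        private int next;

        CountingSpliterator(int limit) {
            super(Long.MAX_VALUE, NONNULL | IMMUTABLE);
            this.limit = limit;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Integer> action) {
            if (next >= limit) return false;
            action.accept(next++);
            return true;
        }
    }

    /** Counts the elements a spliterator really holds by draining it. */
    static long drain(Spliterator<Integer> s) {
        long[] count = {0};
        s.forEachRemaining(i -> count[0]++);
        return count[0];
    }

    /** Splits twice with the inherited trySplit and measures all three parts. */
    static long[] batchSizes(int limit) {
        Spliterator<Integer> source = new CountingSpliterator(limit);
        Spliterator<Integer> first = source.trySplit();
        Spliterator<Integer> second = source.trySplit();
        return new long[] { drain(first), drain(second), drain(source) };
    }

    public static void main(String[] args) {
        long[] sizes = batchSizes(10_000);
        System.out.println("first split:  " + sizes[0] + " elements");
        System.out.println("second split: " + sizes[1] + " elements");
        System.out.println("left over:    " + sizes[2] + " elements");
    }
}
```

On the OpenJDK builds I am aware of, the first split holds 1024 elements and the second 2048, all of them fetched sequentially through tryAdvance by the thread that performs the split. That is why the expensive part of the question's pipeline gains so little from additional CPUs.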

Your spliterator can implement a much more suitable trySplit method:

    import java.util.ArrayDeque;
    import java.util.Spliterator;
    import java.util.Spliterators.AbstractSpliterator;
    import java.util.function.Consumer;

    public class BinaryTreeSpliterator extends AbstractSpliterator<Node> {
        /**
         * a node that has not been traversed, but its children are only
         * traversed if contained in this.pending
         * (otherwise a different spliterator might be responsible)
         */
        private Node pendingNode;
        /** pending nodes needing full traversal */
        private ArrayDeque<Node> pending = new ArrayDeque<>();

        public BinaryTreeSpliterator(Node root) {
            super(Long.MAX_VALUE, NONNULL | IMMUTABLE);
            push(root);
        }

        /** used by trySplit: a single node plus, optionally, one full subtree */
        private BinaryTreeSpliterator(Node pending, Node next) {
            super(Long.MAX_VALUE, NONNULL | IMMUTABLE);
            pendingNode = pending;
            if(next!=null) this.pending.offer(next);
        }

        private void push(Node n) {
            if(pendingNode == null) {
                pendingNode = n;
                if(n != null) {
                    if(n.getRight()!=null) pending.offerFirst(n.getRight());
                    if(n.getLeft() !=null) pending.offerFirst(n.getLeft());
                }
            }
            // ArrayDeque does not accept null, so a missing child is skipped
            else if(n != null) pending.offerFirst(n);
        }

        @Override
        public boolean tryAdvance(Consumer<? super Node> action) {
            Node current = pendingNode;
            if(current == null) {
                current = pending.poll();
                if(current == null) return false;
                push(current.getRight());
                push(current.getLeft());
            }
            else pendingNode = null;
            action.accept(current);
            return true;
        }

        @Override
        public Spliterator<Node> trySplit() {
            Node next = pending.poll();
            if(next == null) return null;
            // keep the last pending subtree for this spliterator
            if(pending.isEmpty()) {
                pending.offer(next);
                next = null;
            }
            if(pendingNode==null) return next==null? null: new BinaryTreeSpliterator(next);
            Spliterator<Node> s = new BinaryTreeSpliterator(pendingNode, next);
            pendingNode = null;
            return s;
        }
    }

Note that this spliterator would also qualify as an ORDERED spliterator, maintaining a top-left-right order. An entirely unordered spliterator could be implemented in a slightly simpler way.
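As a rough illustration of what an unordered variant might look like, here is a sketch that hands a whole subtree to the new instance on every split. It is not the implementation above. Node is a hypothetical stand-in for the question's class (which is not shown), and UnorderedTreeSpliterator and TreeDemo are names invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

/** Hypothetical stand-in for the question's Node class, which is not shown. */
final class Node {
    private final int value;
    private final Node left, right;
    Node(int value, Node left, Node right) { this.value = value; this.left = left; this.right = right; }
    int getValue() { return value; }
    Node getLeft() { return left; }
    Node getRight() { return right; }
}

/** Sketch: every successful split gives one complete subtree to the new instance. */
final class UnorderedTreeSpliterator implements Spliterator<Node> {
    /** nodes still to be emitted whose children are already tracked in subtrees */
    private final ArrayDeque<Node> loose = new ArrayDeque<>();
    /** roots of subtrees this instance still has to traverse completely */
    private final ArrayDeque<Node> subtrees = new ArrayDeque<>();

    UnorderedTreeSpliterator(Node root) {
        if (root != null) subtrees.push(root);
    }

    private void pushChildren(Node n) {
        if (n.getRight() != null) subtrees.push(n.getRight());
        if (n.getLeft() != null) subtrees.push(n.getLeft());
    }

    @Override
    public boolean tryAdvance(Consumer<? super Node> action) {
        Node current = loose.poll();
        if (current == null) {
            current = subtrees.poll();
            if (current == null) return false;
            pushChildren(current);
        }
        action.accept(current);
        return true;
    }

    @Override
    public Spliterator<Node> trySplit() {
        // Unfold a lone subtree until there are two subtrees to share, or none at all.
        while (subtrees.size() == 1) {
            Node n = subtrees.poll();
            loose.add(n);
            pushChildren(n);
        }
        // The tail holds the subtree that was queued first, i.e. the one closest to the root.
        return subtrees.isEmpty() ? null : new UnorderedTreeSpliterator(subtrees.pollLast());
    }

    @Override
    public long estimateSize() { return Long.MAX_VALUE; } // size unknown

    @Override
    public int characteristics() { return NONNULL | IMMUTABLE; } // deliberately not ORDERED
}

final class TreeDemo {
    /** Builds a balanced tree holding the values lo..hi. */
    static Node build(int lo, int hi) {
        if (lo > hi) return null;
        int mid = (lo + hi) >>> 1;
        return new Node(mid, build(lo, mid - 1), build(mid + 1, hi));
    }

    static int parallelSum(Node root) {
        return StreamSupport.stream(new UnorderedTreeSpliterator(root), true)
                            .mapToInt(Node::getValue)
                            .sum();
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(build(1, 10_000)));
    }
}
```

Because the size estimate stays at Long.MAX_VALUE, the stream keeps splitting until trySplit returns null, so this sketch favours simplicity over low splitting overhead.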

You may implement a more efficient forEachRemaining method than the inherited default, e.g.

    @Override
    public void forEachRemaining(Consumer<? super Node> action) {
        Node current = pendingNode;
        if(current != null) {
            pendingNode = null;
            action.accept(current);
        }
        for(;;) {
            current = pending.poll();
            if(current == null) break;
            traverseLocal(action, current);
        }
    }
    private void traverseLocal(Consumer<? super Node> action, Node current) {
        do {
            action.accept(current);
            Node child = current.getLeft();
            if(child!=null) traverseLocal(action, child);
            current = current.getRight();
        } while(current != null);
    }

but this method might cause a StackOverflowError if your application has to deal with unbalanced trees (specifically, very long left paths in this example).
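If deep left paths are a concern, one option is to keep the pending nodes in an explicit stack on the heap instead of relying on recursion. The following is a sketch of that idea, not part of the spliterator above. Node is again a hypothetical stand-in for the question's class, and IterativeTraversal and leftChain are names made up for the demonstration.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;

/** Hypothetical stand-in for the question's Node class, which is not shown. */
final class Node {
    private final int value;
    private final Node left, right;
    Node(int value, Node left, Node right) { this.value = value; this.left = left; this.right = right; }
    int getValue() { return value; }
    Node getLeft() { return left; }
    Node getRight() { return right; }
}

final class IterativeTraversal {

    /** Visits the subtree in top-left-right order without recursion. */
    static void traverseLocal(Consumer<? super Node> action, Node root) {
        ArrayDeque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            action.accept(current);
            // push right first so that the left child is popped, and thus visited, first
            if (current.getRight() != null) stack.push(current.getRight());
            if (current.getLeft() != null) stack.push(current.getLeft());
        }
    }

    /** Builds a degenerate tree in which every node only has a left child. */
    static Node leftChain(int length) {
        Node node = null;
        for (int i = length; i >= 1; i--) node = new Node(i, node, null);
        return node;
    }

    public static void main(String[] args) {
        long[] count = {0};
        traverseLocal(n -> count[0]++, leftChain(1_000_000));
        System.out.println("visited " + count[0] + " nodes");
    }
}
```

The explicit stack only grows with the number of postponed right children, so a chain of a million left children is traversed without deep call nesting.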

answered Sep 25 '22 14:09

Holger

Tags:

java

concurrency

java-stream

binary-tree
