This is a classic implementation of an immutable linked list:
public abstract class List<A> implements Iterable<A> {
private static final List NIL = new Nil();
public abstract A head();
public abstract List<A> tail();
public List<A> cons(A a) { return new Cons<>(a, this); }
public static <A> List<A> nil() { return NIL; }
@Override
public Iterator<A> iterator() {
return new Iterator<A>() {
private List<A> list = List.this;
@Override
public boolean hasNext() {
return list != NIL;
}
@Override
public A next() {
A n = list.head();
list = list.tail();
return n;
}
};
}
public Stream<A> stream() {
return StreamSupport.stream(spliterator(), false);
}
public Stream<A> parallelStream() {
return StreamSupport.stream(spliterator(), true);
}
}
class Nil extends List {
@Override public Object head() { throw new NoSuchElementException(); }
@Override public List tail() { throw new NoSuchElementException(); }
}
class Cons<A> extends List<A> {
private final A head;
private final List<A> tail;
Cons(A head, List<A> tail) {
this.head = head;
this.tail = tail;
}
@Override public A head() { return head; }
@Override public List<A> tail() { return tail; }
}
The default implementation of spliterator()
does not support efficient parallelizing:
List<Integer> list = List.<Integer> nil().cons(3).cons(2).cons(1);
list.parallelStream().forEach(i -> {
System.out.println(i);
try {
Thread.sleep(1000);
} catch (Exception e) {
e.printStackTrace();
}
});
This will print 1, 2, 3
sequentially.
How to implement spliterator()
to support efficient parallelizing?
Spliterators, like other Iterators, are for traversing the elements of a source. A source can be a Collection, an IO channel or a generator function. It is included in JDK 8 for support of efficient parallel traversal(parallel programming) in addition to sequential traversal.
An object for traversing and partitioning elements of a source. The source of elements covered by a Spliterator could be, for example, an array, a Collection , an IO channel, or a generator function. A Spliterator may traverse elements individually ( tryAdvance() ) or sequentially in bulk ( forEachRemaining() ).
It uses forEachRemaining() method to iterate elements sequentially in a single Thread. It uses trySplit() method to divide itself into Sub-Spliterators to support Parallel Processing. Spliterator supports both Sequential and Parallel processing of data.
Java Spliterator Features It provides tryAdvance() method to iterate elements individually in different threads. It helps in parallel processing. To iterate elements sequentially in a single Thread, use forEachRemaining() method. The trySplit() method is used partition the spliterator, if it is possible.
Prerequisite : Iterators in java. Spliterators, like other Iterators, are for traversing the elements of a source. A source can be a Collection, an IO channel or a generator function. It is included in JDK 8 for support of efficient parallel traversal(parallel programming) in addition to sequential traversal.
The iterator () method is declared in the Iterable interface, It is implemented by AbstractSequentialList class. In LinkedList, it returns an object of the iterator that contains all elements of the LinkedList. The iterator iterates the elements of LinkedList in sequential order. To get the element we have to use the next () method of Iterator.
The trySplit method tries to split it into two parts. Then the caller process elements, and finally, the returned instance will process the others, allowing the two to be processed in parallel. Let's generate our list first: Next, we obtain our Spliterator instance using the spliterator () method. Then we apply our trySplit () method:
The spliterator () method of List Interface creates a Spliterator over the elements in the given list. The spliterator () method returns a Spliterator over the elements in this list.
The spliterators which cannot report even the estimated size (which is default implementation for Iterable
) are poorly split by parallel pipeline. You can fix this issue if you track the size of the List
. In your case it's not very hard to track the exact size:
public abstract class List<A> implements Iterable<A> {
...
public abstract long size();
@Override
public Spliterator<A> spliterator() {
return Spliterators.spliterator(iterator(), size(), Spliterator.ORDERED);
}
}
class Nil extends List {
...
public long size() {
return 0;
}
}
class Cons<A> extends List<A> {
...
private final long size;
Cons(A head, List<A> tail) {
this.head = head;
this.tail = tail;
this.size = tail.size()+1;
}
...
@Override
public long size() {
return size;
}
}
After that parallelization would work better. Note that it's still poor man parallelization, because you cannot quickly jump to the middle of the list, but in many cases it will provide a reasonable speed-up.
Also note that it's better to explicitly specify the Spliterator.ORDERED
characteristic. Otherwise the order may be ignored in parallel stream operations even if it's explicitly requested (for example, by forEachOrdered()
terminal operation).
You might use some algorithm with interleaving - e.g., counting the elements and using remainder after integer division. That could split the elements for parallel iteration.
You could also iterate once before the iterator is even constructed, to split the list into intervals, but that would beat the purpose of stream - e.g. if you use it for anyMatch
, it will slow you a lot.
There is no really efficient way to split a linked list (in less than linear time), unless you create your own implementation of linked list that has some additional information.
Edit: Oh wait - you only implement Iterable
. That is quite limiting, you have to come up with an algorithm that only has one pass. That means, the splitting itself won't be parallel at all, so you might as well enforce your parallelism elsewhere in the process.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With