I think this is a fairly basic question about Java 8 streams, but I am having trouble coming up with the right search terms, so I am asking it here. I am just getting into Java 8, so bear with me.
I was wondering how I could map a stream of tokens to a stream of n-grams (represented as arrays of n tokens). For example, with n = 3, I would like to convert the following stream
{1, 2, 3, 4, 5, 6, 7}
to
{[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]}
How would I accomplish this with Java 8 streams? It should be possible to compute this concurrently, which is why I am interested in accomplishing this with streams (it also doesn't matter in what order the n-arrays are processed).
Sure, I could do it easily with old-fashioned for-loops, but I would prefer to make use of the stream API.
If you do not have random access to the source data, you can accomplish this with a custom collector:
    // assumes: import static java.util.stream.Collectors.toList;
    List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
    List<List<Integer>> result = data.stream().collect(window(3, toList(), toList()));
Here's the source for window. It is parallel-friendly:
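    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;
    import java.util.stream.Collector;

    public static <T, I, A, R> Collector<T, ?, R> window(int windowSize, Collector<T, ?, ? extends I> inner, Collector<I, A, R> outer) {
        class Window {
            // First (windowSize - 1) elements of this segment, replayed into a preceding segment on merge.
            final List<T> left = new ArrayList<>(windowSize - 1);
            // Accumulator of the outer collector, holding the windows completed so far.
            A mid = outer.supplier().get();
            // Most recent elements, waiting to form the next window.
            Deque<T> right = new ArrayDeque<>(windowSize);

            void add(T t) {
                right.addLast(t);
                if (left.size() == windowSize - 1) {
                    // right now holds exactly windowSize elements: emit a window, then slide by one.
                    outer.accumulator().accept(mid, right.stream().collect(inner));
                    right.removeFirst();
                } else {
                    left.add(t);
                }
            }

            Window merge(Window other) {
                // Replay other's leading elements to emit the windows that straddle the boundary.
                other.left.forEach(this::add);
                if (other.left.size() == windowSize - 1) {
                    // other was full: append its windows and adopt its trailing elements.
                    this.mid = outer.combiner().apply(mid, other.mid);
                    this.right = other.right;
                }
                return this;
            }

            R finish() {
                return outer.finisher().apply(mid);
            }
        }
        return Collector.of(Window::new, Window::add, Window::merge, Window::finish);
    }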
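For comparison, if the source does support random access (for example an ArrayList or the list returned by Arrays.asList), a simpler index-based approach is possible. The following is only a minimal sketch under that assumption; the class name NGrams and the method ngrams are hypothetical names chosen for illustration and are not part of the collector above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class NGrams {
    // Hypothetical helper: index-based n-grams for a list with cheap random access.
    static <T> List<List<T>> ngrams(List<T> tokens, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        // Number of windows; zero when the list is shorter than n.
        int count = Math.max(0, tokens.size() - n + 1);
        return IntStream.range(0, count)
                .parallel()
                // subList returns a view of the source list, so no elements are copied.
                .mapToObj(i -> tokens.subList(i, i + n))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(ngrams(data, 3));
        // prints [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7]]
    }
}
```

Because each window is computed from its start index alone, the stream splits cleanly for parallel execution. The sublists are views, so they should be copied (for example with new ArrayList<>(...)) if the source list may be modified afterwards.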