How to break stream computation based on previous results? If it's obvious that stream.filter(...).count() would be less than some number - how to stop stream computation?
I have the following code which checks if some sampleData
passes the predicate
test:
// sampleData.size() may be greater than 10.000.000
Set<String> sampleData = downloadFromWeb();
return sampleData.stream().filter(predicate::test).count() > sampleData.size() * coefficient;
I could have thousands of sampleData
. The problem is that this code is ineffective. For example, if coefficient
equals 0.5
, sampleData.size() = 10_000_000
, and first 5_000_000
elements fails the predicate::test
- there is no reason to validate last 5_000_000
elements (count() will never be greater than 5_000_000
).
ZhekaKozlov’s answer is heading into the right direction, but it lacks the negation. For the matches to be larger than a certain threshold, the number of non matching elements must be smaller than “size - threshold”. If we test for the nonmatching elements to be smaller, we can apply a limit
to stop once they become larger:
Set<String> sampleData = downloadFromWeb();
final long threshold = sampleData.size()-(long)(sampleData.size() * coefficient);
return sampleData.stream()
.filter(predicate.negate()).limit(threshold+1).count() < threshold;
There is, by the way, no reason to create a method reference to the test method of an existing Predicate
like with predicate::test
. Just pass the Predicate
to the filter
method. The code above also uses predicate.negate()
instead of predicate.negate()::test
…
To be honest I am not quite sure this would be correct, I hope someone will come along and review this, but here is my idea of using a custom spliterator:
static class CustomSpl<T> extends AbstractSpliterator<T> {
private Spliterator<T> source;
private int howMany;
private int coefficient;
private Predicate<T> predicate;
private T current;
private long initialSize;
private void setT(T t) {
this.current = t;
}
public CustomSpl(Spliterator<T> source, int howMany, int coefficient, Predicate<T> predicate, long initialSize) {
super(source.estimateSize(), source.characteristics());
this.source = source;
this.howMany = howMany;
this.coefficient = coefficient;
this.predicate = predicate;
this.initialSize = initialSize;
}
@Override
public boolean tryAdvance(Consumer<? super T> action) {
boolean hasMore = source.tryAdvance(this::setT);
System.out.println(current);
if (!hasMore) {
return false;
}
if (predicate.test(current)) {
++howMany;
}
if (initialSize - howMany <= coefficient) {
return false;
}
action.accept(current);
return true;
}
}
And for example this will produce only 4 elements, since we said to only care having a coefficient 5
:
Spliterator<Integer> sp = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).stream().spliterator();
long count = StreamSupport.stream(new CustomSpl<>(sp, 0, 5, x -> x > 3, sp.getExactSizeIfKnown()), false)
.count();
Also this is possible for spliterators with known size only.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With