Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find the minimum element of a stream, but bail out early if it's <= N

I want to find the minimum element of a large (hundreds of millions of elements) IntStream, but I can only use the result if it is > N, so I want to bail out as soon as I find an element <= N. I expect the minimum to be <= N most of the time.

IntStream.min() doesn't short-circuit, so I'd be stuck processing all the elements. The general IntStream.reduce doesn't short-circuit either.

IntStream.noneMatch(x -> x <= N) will ensure that the minimum element is > N and does short-circuit if it isn't, but doesn't actually tell me that minimum. I'd have to maintain state in the predicate (and add synchronization or be limited to sequential streams) to remember the actual minimum. Alternately, I could increment N and try again, possibly doing some sort of binary search over a likely range of N, but that sounds both slow and complicated.

How can I find the minimum of an IntStream, short-circuiting as soon as it's known to be <= N?

like image 501
Jeffrey Bosboom Avatar asked Mar 02 '15 01:03

Jeffrey Bosboom


1 Answers

My first thought was to use findAny but then I reread the question. :-)

The difference, of course, is that findAny (or findFirst) short-circuits as soon as it finds a matching element. You want to short-circuit if a non-matching element is found, then reduce or accumulate the rest.

While accumulation is a form of mutation and is frowned upon by FP purists, it really comes in handy sometimes. Java 8 has some nice additions like LongAccumulator that do accumulation of primitive values in a low-contention way, making them suitable for parallel processing. You can put the accumulation step into the peek stream operation.

Unfortunately there is no "IntAccumulator" so you have to do processing using long values. You can either turn your source into a LongStream or map the int values into long values.

Once you've done that, it's probably most straightforward to handle short-circuiting using allMatch. Naturally, if allMatch returns false, short-circuiting has occurred and the accumulator doesn't actually have the minimum value. But it should have the minimum value examined so far, probably the one that triggered the short-circuiting.

Putting it together would look something like this:

IntStream istream = ... 
LongAccumulator acc = new LongAccumulator(Long::min, Long.MAX_VALUE);

if (istream.mapToLong(i -> i).peek(acc::accumulate).allMatch(i -> i > N)) {
    System.out.println("min was " + acc.get());
} else {
    System.out.println("a value was <= " + N);
}
like image 150
Stuart Marks Avatar answered Nov 28 '22 08:11

Stuart Marks