I want to determine the minimum area required to display a collection of points. The easy way is to loop through the collection like this:
int minX = Integer.MAX_VALUE;
int maxX = Integer.MIN_VALUE;
int minY = Integer.MAX_VALUE;
int maxY = Integer.MIN_VALUE;
for (Point point: points) {
if (point.x < minX) {
minX = point.x;
}
if (point.x > maxX) {
maxX = point.x;
}
if (point.y < minY) {
minY = point.y;
}
if (point.y > maxY) {
maxY = point.y;
}
}
I am getting to know streams. To do the same, you can do the following:
int minX = points.stream().mapToInt(point -> point.x).min().orElse(-1);
int maxX = points.stream().mapToInt(point -> point.x).max().orElse(-1);
int minY = points.stream().mapToInt(point -> point.y).min().orElse(-1);
int maxY = points.stream().mapToInt(point -> point.y).max().orElse(-1);
Both give the same result. However, although the streams approach is elegant, it is much slower (as expected).
Is there a way to get minX
, maxX
, minY
and maxY
in a single stream operation?
JDK 12 and above has Collectors.teeing
(webrev and CSR), which collects to two different collectors and then merges both partial results into a final result.
You could use it here to collect to two IntSummaryStatistics
for both the x
coordinate and the y
coordinate:
List<IntSummaryStatistics> stats = points.stream()
.collect(Collectors.teeing(
Collectors.mapping(p -> p.x, Collectors.summarizingInt()),
Collectors.mapping(p -> p.y, Collectors.summarizingInt()),
List::of));
int minX = stats.get(0).getMin();
int maxX = stats.get(0).getMax();
int minY = stats.get(1).getMin();
int maxY = stats.get(1).getMax();
Here the first collector gathers statistics for x
and the second one for y
. Then, statistics for both x
and y
are merged into a List
by means of the JDK 9 List.of
factory method that accepts two elements.
An aternative to List::of
for the merger would be:
(xStats, yStats) -> Arrays.asList(xStats, yStats)
If you happen to not have JDK 12 installed on your machine, here's a simplified, generic version of the teeing
method that you can safely use as a utility method:
public static <T, A1, A2, R1, R2, R> Collector<T, ?, R> teeing(
Collector<? super T, A1, R1> downstream1,
Collector<? super T, A2, R2> downstream2,
BiFunction<? super R1, ? super R2, R> merger) {
class Acc {
A1 acc1 = downstream1.supplier().get();
A2 acc2 = downstream2.supplier().get();
void accumulate(T t) {
downstream1.accumulator().accept(acc1, t);
downstream2.accumulator().accept(acc2, t);
}
Acc combine(Acc other) {
acc1 = downstream1.combiner().apply(acc1, other.acc1);
acc2 = downstream2.combiner().apply(acc2, other.acc2);
return this;
}
R applyMerger() {
R1 r1 = downstream1.finisher().apply(acc1);
R2 r2 = downstream2.finisher().apply(acc2);
return merger.apply(r1, r2);
}
}
return Collector.of(Acc::new, Acc::accumulate, Acc::combine, Acc::applyMerger);
}
Please note that I'm not considering the characteristics of the downstream collectors when creating the returned collector.
By analogy with IntSummaryStatistics
, create a class PointStatistics
which collects the information you need. It defines two methods: one for recording values from a Point
, one for combining two Statistics
.
class PointStatistics {
private int minX = Integer.MAX_VALUE;
private int maxX = Integer.MIN_VALUE;
private int minY = Integer.MAX_VALUE;
private int maxY = Integer.MIN_VALUE;
public void accept(Point p) {
minX = Math.min(minX, p.x);
maxX = Math.max(maxX, p.x);
minY = Math.min(minY, p.y);
maxY = Math.max(minY, p.y);
}
public void combine(PointStatistics o) {
minX = Math.min(minX, o.minX);
maxX = Math.max(maxX, o.maxX);
minY = Math.min(minY, o.minY);
maxY = Math.max(maxY, o.maxY);
}
// getters
}
Then you can collect a Stream<Point>
into a PointStatistics
.
class Program {
public static void main(String[] args) {
List<Point> points = new ArrayList<>();
// populate 'points'
PointStatistics statistics = points
.stream()
.collect(PointStatistics::new, PointStatistics::accept, PointStatistics::combine);
}
}
UPDATE
I was completely baffled by the conclusion drawn by OP, so I decided to write JMH benchmarks.
Benchmark settings:
# JMH version: 1.21
# VM version: JDK 1.8.0_171, Java HotSpot(TM) 64-Bit Server VM, 25.171-b11
# Warmup: 1 iterations, 10 s each
# Measurement: 10 iterations, 10 s each
# Timeout: 10 min per iteration
# Benchmark mode: Average time, time/op
For each iteration, I was generating a shared list of random Point
s (new Point(random.nextInt(), random.nextInt())
) of size 100K, 1M, 10M.
The results are
100K
Benchmark Mode Cnt Score Error Units
customCollector avgt 10 6.760 ± 0.789 ms/op
forEach avgt 10 0.255 ± 0.033 ms/op
fourStreams avgt 10 5.115 ± 1.149 ms/op
statistics avgt 10 0.887 ± 0.114 ms/op
twoStreams avgt 10 2.869 ± 0.567 ms/op
1M
Benchmark Mode Cnt Score Error Units
customCollector avgt 10 68.117 ± 4.822 ms/op
forEach avgt 10 3.939 ± 0.559 ms/op
fourStreams avgt 10 57.800 ± 4.817 ms/op
statistics avgt 10 9.904 ± 1.048 ms/op
twoStreams avgt 10 32.303 ± 2.498 ms/op
10M
Benchmark Mode Cnt Score Error Units
customCollector avgt 10 714.016 ± 151.558 ms/op
forEach avgt 10 54.334 ± 9.820 ms/op
fourStreams avgt 10 699.599 ± 138.332 ms/op
statistics avgt 10 148.649 ± 26.248 ms/op
twoStreams avgt 10 429.050 ± 72.879 ms/op
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With