With SQL I can write this to summarize data:
SELECT sum(f1), sum(f2), f3, f4 FROM TABLEX GROUP BY f3, f4
This returns a list with 4 values in each row: sum1, sum2, v3, v4
For example, if the table contains:
1, 2, a, b
1, 2, a, b
2, 2, c, d
2, 2, c, d
3, 4, c, d
The result will be:
2, 4, a, b
7, 8, c, d
Now assume that instead of a database table, I have the data in a Java List of objects with fields f1, f2, f3, f4. Is there a function in the Java Stream API that can summarize this list grouped by f3, f4?
Introduced in Java 8, the Stream API is used to process collections of objects. A stream is a sequence of objects that supports various methods which can be pipelined to produce the desired result. A stream is not a data structure; instead, it takes its input from collections, arrays, or I/O channels.
Java SE 8 introduced the Streams API, which lets you express sophisticated data processing queries. A stream supports many operations, such as filter, map, reduce, and iterate, that can be combined to write concise and expressive data processing queries.
The groupingBy() method of the Collectors class is used for grouping objects by some property and storing the results in a Map instance. To use it, we always need to specify the property (a classifier function) by which the grouping is performed. This method provides functionality similar to SQL's GROUP BY clause.
The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". The GROUP BY statement is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result set by one or more columns.
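As a minimal sketch of the groupingBy() and GROUP BY correspondence described above, the "number of customers in each country" example could look roughly like this with plain JDK collectors. The Customer class and the countByCountry() helper are made up for illustration and are not part of the question; a TreeMap is used only to make the printed order predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupByCountSketch {

    // Hypothetical row type, used only for this illustration.
    static final class Customer {
        final String name;
        final String country;

        Customer(String name, String country) {
            this.name = name;
            this.country = country;
        }

        String getCountry() {
            return country;
        }
    }

    // Roughly: SELECT country, COUNT(*) FROM customers GROUP BY country
    static Map<String, Long> countByCountry(List<Customer> customers) {
        return customers.stream().collect(
            Collectors.groupingBy(
                Customer::getCountry,     // GROUP BY country
                TreeMap::new,             // sorted keys, so the output order is stable
                Collectors.counting()));  // COUNT(*)
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
            new Customer("Ann", "DE"),
            new Customer("Bob", "US"),
            new Customer("Cid", "DE"));

        System.out.println(countByCountry(customers)); // {DE=2, US=1}
    }
}
```

The second argument of groupingBy(), the downstream collector, plays the role of the aggregate function; swapping counting() for summingDouble() gives the SUM() variant asked about here.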
Doing this with the JDK's Stream API only isn't really straightforward, as other answers have shown. This article explains how you can achieve the SQL semantics of GROUP BY in Java 8 (with standard aggregate functions) and by using jOOλ, a library that extends Stream for these use-cases. This article shows how other SQL clauses can be mapped to Java 8 Streams.
Given your SQL query:
SELECT sum(f1), sum(f2), f3, f4 FROM TABLEX GROUP BY f3, f4
You'd first have to create a named tuple X (as also explained in this answer):
public class X {
    final Double f1;
    final Double f2;
    final Object f3;
    final Object f4;

    public X(Double f1, Double f2, Object f3, Object f4) {
        this.f1 = f1;
        this.f2 = f2;
        this.f3 = f3;
        this.f4 = f4;
    }
}
And then, write:
import static org.jooq.lambda.tuple.Tuple.tuple;
import java.util.Map;
import java.util.stream.Collectors;
import org.jooq.lambda.Seq;
import org.jooq.lambda.tuple.Tuple;
import org.jooq.lambda.tuple.Tuple2;
// ...
Map<
    Tuple2<Object, Object>,
    Tuple2<Double, Double>
> map =
// FROM TABLEX
Seq.of(
    new X(1.0, 2.0, "a", "b"),
    new X(1.0, 2.0, "a", "b"),
    new X(2.0, 2.0, "c", "d"),
    new X(2.0, 2.0, "c", "d"),
    new X(3.0, 4.0, "c", "d"))
// GROUP BY f3, f4
.groupBy(
    x -> tuple(x.f3, x.f4),
    // SELECT SUM(f1), SUM(f2)
    Tuple.collectors(
        Collectors.summingDouble(x -> x.f1),
        Collectors.summingDouble(x -> x.f2)
    )
);
A call to
System.out.println(map);
will then yield
{(c, d)=(7.0, 8.0), (a, b)=(2.0, 4.0)}
If you don't want to create an external dependency, all you have to do is create your own Tuple2 class:
public class Tuple2<T1, T2> {
    // Named v1 and v2 because the collector below reads a.v1 and a.v2
    final T1 v1;
    final T2 v2;
    // Constructors, a static tuple(v1, v2) factory, toString, equals(), hashCode(), etc.
}
... add your own tuple collector:
static <T, A1, A2, D1, D2> Collector<T, Tuple2<A1, A2>, Tuple2<D1, D2>> collectors(
    Collector<T, A1, D1> collector1,
    Collector<T, A2, D2> collector2
) {
    return Collector.<T, Tuple2<A1, A2>, Tuple2<D1, D2>>of(
        // supplier: one accumulation container per wrapped collector
        () -> tuple(
            collector1.supplier().get(),
            collector2.supplier().get()
        ),
        // accumulator: feed each element to both collectors
        (a, t) -> {
            collector1.accumulator().accept(a.v1, t);
            collector2.accumulator().accept(a.v2, t);
        },
        // combiner: merge partial results (used by parallel streams)
        (a1, a2) -> tuple(
            collector1.combiner().apply(a1.v1, a2.v1),
            collector2.combiner().apply(a1.v2, a2.v2)
        ),
        // finisher: convert both containers to their final results
        a -> tuple(
            collector1.finisher().apply(a.v1),
            collector2.finisher().apply(a.v2)
        )
    );
}
And replace Seq by Stream, and groupBy() by collect() and Collectors.groupingBy().
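Put together, the dependency-free variant might look roughly like the sketch below. It is only an illustration under a few assumptions: Tuple2, tuple() and collectors() are the hand-written stand-ins described above (not the jOOλ classes), the class name JdkGroupBySketch and the helper sumByF3AndF4() are made up, and the lambda parameters are typed explicitly only to keep type inference simple.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class JdkGroupBySketch {

    // Hand-rolled pair; equals() and hashCode() matter because it is used as a map key.
    static final class Tuple2<T1, T2> {
        final T1 v1;
        final T2 v2;

        Tuple2(T1 v1, T2 v2) {
            this.v1 = v1;
            this.v2 = v2;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Tuple2)) return false;
            Tuple2<?, ?> other = (Tuple2<?, ?>) o;
            return Objects.equals(v1, other.v1) && Objects.equals(v2, other.v2);
        }

        @Override public int hashCode() { return Objects.hash(v1, v2); }

        @Override public String toString() { return "(" + v1 + ", " + v2 + ")"; }
    }

    static <T1, T2> Tuple2<T1, T2> tuple(T1 v1, T2 v2) {
        return new Tuple2<>(v1, v2);
    }

    // Same shape as the X class from this answer.
    static final class X {
        final Double f1;
        final Double f2;
        final Object f3;
        final Object f4;

        X(Double f1, Double f2, Object f3, Object f4) {
            this.f1 = f1;
            this.f2 = f2;
            this.f3 = f3;
            this.f4 = f4;
        }
    }

    // Runs two collectors side by side and pairs up their results.
    static <T, A1, A2, D1, D2> Collector<T, Tuple2<A1, A2>, Tuple2<D1, D2>> collectors(
            Collector<T, A1, D1> c1, Collector<T, A2, D2> c2) {
        return Collector.<T, Tuple2<A1, A2>, Tuple2<D1, D2>>of(
            () -> tuple(c1.supplier().get(), c2.supplier().get()),
            (a, t) -> {
                c1.accumulator().accept(a.v1, t);
                c2.accumulator().accept(a.v2, t);
            },
            (a, b) -> tuple(c1.combiner().apply(a.v1, b.v1), c2.combiner().apply(a.v2, b.v2)),
            a -> tuple(c1.finisher().apply(a.v1), c2.finisher().apply(a.v2)));
    }

    // SELECT sum(f1), sum(f2), f3, f4 FROM TABLEX GROUP BY f3, f4
    static Map<Tuple2<Object, Object>, Tuple2<Double, Double>> sumByF3AndF4(List<X> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            x -> tuple(x.f3, x.f4),
            collectors(
                Collectors.summingDouble((X x) -> x.f1),
                Collectors.summingDouble((X x) -> x.f2))));
    }

    public static void main(String[] args) {
        Map<Tuple2<Object, Object>, Tuple2<Double, Double>> map = sumByF3AndF4(Arrays.asList(
            new X(1.0, 2.0, "a", "b"),
            new X(1.0, 2.0, "a", "b"),
            new X(2.0, 2.0, "c", "d"),
            new X(2.0, 2.0, "c", "d"),
            new X(3.0, 4.0, "c", "d")));

        System.out.println(map.get(tuple("a", "b"))); // (2.0, 4.0)
        System.out.println(map.get(tuple("c", "d"))); // (7.0, 8.0)
    }
}
```

The groups are looked up by key rather than printed as a whole map because the default groupingBy() result makes no promise about iteration order.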
As OP said: f1, f2 are double values. f3, f4 can be anything.
So let's consider this class:
public class TABLEX {
    double f1;
    double f2;
    Object f3;
    Object f4;
    // constructor TABLEX(double f1, double f2, Object f3, Object f4), getters and setters
}
Example list of TABLEX objects:
List<TABLEX> list = new ArrayList<TABLEX>();
list.add(new TABLEX(1, 2, "a", "b"));
list.add(new TABLEX(1, 2, "a", "b"));
list.add(new TABLEX(2, 2, "c", "d"));
list.add(new TABLEX(2, 2, "c", "d"));
list.add(new TABLEX(3, 4, "c", "d"));
Equivalent of SELECT sum(f1) FROM TABLEX GROUP BY f3, f4:
Map<Object, Map<Object, Double>> sumF1 = list.stream().collect(
Collectors.groupingBy(
TABLEX::getF3,
Collectors.groupingBy(TABLEX::getF4,
Collectors.summingDouble(TABLEX::getF1))));
System.out.println(sumF1);
Output:
{a={b=2.0}, c={d=7.0}}
Equivalent of SELECT sum(f2) FROM TABLEX GROUP BY f3, f4:
Map<Object, Map<Object, Double>> sumF2 = list.stream().collect(
Collectors.groupingBy(
TABLEX::getF3,
Collectors.groupingBy(TABLEX::getF4,
Collectors.summingDouble(TABLEX::getF2))));
System.out.println(sumF2);
Output:
{a={b=4.0}, c={d=8.0}}
EDIT: To get the expected result:
sumF1.forEach((f3, m) -> m.forEach((f4, sum) -> System.out.println(sum
+ " " + sumF2.get(f3).get(f4) + " " + f3 + " " + f4)));
Output:
2.0 4.0 a b
7.0 8.0 c d
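The approach above streams the list twice and joins the two nested maps afterwards. If a single pass is preferred, one possible sketch (not part of the original answer) groups by a composite List key and sums both fields into a double[2]. The names Row, sumF1AndF2() and sumByF3AndF4() are made up for illustration; Row stands in for TABLEX with its constructor and getters spelled out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class SinglePassGroupBySketch {

    // Stand-in for TABLEX, with the constructor and getters written out.
    static final class Row {
        private final double f1;
        private final double f2;
        private final Object f3;
        private final Object f4;

        Row(double f1, double f2, Object f3, Object f4) {
            this.f1 = f1;
            this.f2 = f2;
            this.f3 = f3;
            this.f4 = f4;
        }

        double getF1() { return f1; }
        double getF2() { return f2; }
        Object getF3() { return f3; }
        Object getF4() { return f4; }
    }

    // Accumulates sum(f1) in slot 0 and sum(f2) in slot 1 of a double[2].
    static Collector<Row, double[], double[]> sumF1AndF2() {
        return Collector.of(
            () -> new double[2],
            (acc, r) -> { acc[0] += r.getF1(); acc[1] += r.getF2(); },
            (a, b) -> { a[0] += b[0]; a[1] += b[1]; return a; });
    }

    // SELECT sum(f1), sum(f2), f3, f4 FROM TABLEX GROUP BY f3, f4
    static Map<List<Object>, double[]> sumByF3AndF4(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
            r -> Arrays.asList(r.getF3(), r.getF4()), // composite key (f3, f4)
            LinkedHashMap::new,                       // keep first-seen group order
            sumF1AndF2()));
    }

    public static void main(String[] args) {
        List<Row> list = Arrays.asList(
            new Row(1, 2, "a", "b"),
            new Row(1, 2, "a", "b"),
            new Row(2, 2, "c", "d"),
            new Row(2, 2, "c", "d"),
            new Row(3, 4, "c", "d"));

        // Prints:
        // 2.0 4.0 a b
        // 7.0 8.0 c d
        sumByF3AndF4(list).forEach((key, sums) ->
            System.out.println(sums[0] + " " + sums[1] + " " + key.get(0) + " " + key.get(1)));
    }
}
```

A List works as the key because its equals() and hashCode() are defined element by element; the trade-off is that the key and the sums are positional rather than named.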