I'm using the counting collector in Java 8 to count how often each value occurs.
For example, if I have a bunch of streams like
Stream<String> doc1 = Stream.of("a", "b", "c", "b", "c");
Stream<String> doc2 = Stream.of("b", "c", "d");
Stream<Stream<String>> docs = Stream.of(doc1, doc2);
I am able to count the occurrences of each word in a doc by doing
List<Map<String, Long>> collect = docs
.map(doc -> doc.collect(Collectors.groupingBy(Function.identity(), Collectors.counting())))
.collect(Collectors.toList());
This results in a structure like
[
{a=1, b=2, c=2},
{b=1, c=1, d=1}
]
However, I would like each count to be associated with the docId of the doc it came from. For example, I would like a structure like
[
{a=(randId1, 1), b=(randId1, 2), c=(randId1, 2)},
{b=(randId2, 1), c=(randId2, 1), d=(randId2, 1)}
]
where randId1 and randId2 can be generated at runtime (I just need a way to trace back to a unique source) and () represents a Pair class from Apache.
I have tried wrapping the doc in a Pair of (docId, doc), but I am stuck on what to substitute for Collectors.counting():
List<Map<String, Long>> collect = docs.map(doc -> Pair.of(UUID.randomUUID(), doc))
.map(p -> p.getRight().collect(Collectors.groupingBy(Function.identity(), Collectors.counting())))
.collect(Collectors.toList());
How do I get the output in the format I need?
This isn't very readable... I've replaced Pair with AbstractMap.SimpleEntry, since it does the same job and I already have it on my classpath. Note that the entry here holds (count, id) rather than (id, count).
List<Map<String, AbstractMap.SimpleEntry<Long, UUID>>> result = docs
    .map(doc -> doc.collect(Collectors.collectingAndThen(
        Collectors.groupingBy(Function.identity(), Collectors.counting()),
        map -> {
            UUID rand = UUID.randomUUID();
            return map.entrySet().stream().collect(Collectors.toMap(
                Entry::getKey,
                e -> new AbstractMap.SimpleEntry<>(e.getValue(), rand)));
        })))
    .collect(Collectors.toList());
System.out.println(result);
And the output of this:
[{a=1=890d7276-efb7-41cc-bda7-f2dd2859e740,
b=2=890d7276-efb7-41cc-bda7-f2dd2859e740,
c=2=890d7276-efb7-41cc-bda7-f2dd2859e740},
{b=1=888d78a5-0dea-4cb2-8686-c06c784d4c66,
c=1=888d78a5-0dea-4cb2-8686-c06c784d4c66,
d=1=888d78a5-0dea-4cb2-8686-c06c784d4c66}]
How about this?
List<Map<String, Pair<UUID, Long>>> collect = docs.map(doc -> {
    UUID id = UUID.randomUUID();
    return doc.collect(groupingBy(
        identity(),
        // v--- adapting Collector<?,?,Long> to Collector<?,?,Pair>
        collectingAndThen(counting(), n -> Pair.of(id, n))
    ));
}).collect(Collectors.toList());
I just copied your code snippet and adapted your last generic argument from Long to Pair using Collectors#collectingAndThen. These are the two places in your original code that had to change:
//               v--- the code that needs editing is here
List<Map<String, Long>> collect = docs
    .map(doc -> doc.collect(Collectors.groupingBy(Function.identity()
        // the code that needs editing is here ---v
        , Collectors.counting())))
    .collect(Collectors.toList());
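For anyone who wants to try this without Apache Commons on the classpath, here is a minimal self-contained sketch of the same collectingAndThen idea. It swaps Pair for AbstractMap.SimpleEntry (as the other answer does), and the class name DocWordCounts and the helper methods countWords and countAll are made up for illustration; they are not from the question.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; names are illustrative only.
class DocWordCounts {

    // Counts each word in one doc and wraps every count as (docId, count).
    static Map<String, SimpleEntry<UUID, Long>> countWords(UUID docId, Stream<String> doc) {
        return doc.collect(Collectors.groupingBy(
                Function.identity(),
                // counting() yields a Long; the finisher turns it into (docId, count)
                Collectors.collectingAndThen(
                        Collectors.counting(),
                        n -> new SimpleEntry<>(docId, n))));
    }

    // Assigns a fresh random id to each doc and collects the per-doc maps.
    static List<Map<String, SimpleEntry<UUID, Long>>> countAll(Stream<Stream<String>> docs) {
        return docs
                .map(doc -> countWords(UUID.randomUUID(), doc))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> doc1 = Stream.of("a", "b", "c", "b", "c");
        Stream<String> doc2 = Stream.of("b", "c", "d");
        // One map per doc; the ids differ on every run.
        countAll(Stream.of(doc1, doc2)).forEach(System.out::println);
    }
}
```

Generating the id outside the collector, once per doc, is what makes every entry in the same map share one id. Splitting the per-doc step into its own method with a declared return type also spares the compiler from inferring the nested generic types through the lambda.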