
Java 8: group a collection by a field, then flatten and merge a collection as the mapped value using streams?

My class has two fields:

  • MyKey - the key that I want to group by
  • Set<MyEnum> - the set that I want to be flattened and merged.

I have a list of such objects, and I want to obtain a Map<MyKey, Set<MyEnum>> in which each value is the union of the myEnums of all objects with that key.

For example, if I have three objects:

  1. myKey: key1, myEnums: [E1]
  2. myKey: key1, myEnums: [E2]
  3. myKey: key2, myEnums: [E1, E3]

The expected result should be:

key1 => [E1, E2], key2 => [E1, E3]

I came up with this code:

Map<MyKey, Set<MyEnum>> map = myObjs.stream()
        .collect(Collectors.groupingBy(
                MyType::getMyKey,
                Collectors.reducing(
                        new HashSet<MyEnum>(),
                        MyType::getMyEnums,
                        (a, b) -> {
                            a.addAll(b);
                            return a;
                        })));

There are two problems with it:

  1. The HashSet inside the reducing seems to be shared between all keys. That is, the actual result of running the above example is key1 => [E1, E2, E3], key2 => [E1, E2, E3]. Why is that the case?

  2. Even if this code worked, it looks ugly, especially the reducing part, where I have to build the merged collection by hand. Is there a better way of doing this?

Thank you!

MGhostSoft asked Aug 29 '16 21:08



1 Answer

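Notice that you only ever create one identity object: new HashSet<MyEnum>(). Collectors.reducing uses that same instance as the starting value for every group, and your operator mutates it with addAll and returns it, so every key ends up mapped to the same accumulated set.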
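
To see the sharing concretely, here is a small self-contained sketch that runs the collector from the question on the three sample objects. The class name SharedIdentityDemo, the MyKey enum constants, and the minimal MyType class are stand-ins invented for illustration; only the collector itself comes from the question.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class SharedIdentityDemo {
    // Hypothetical stand-ins for the question's types.
    enum MyKey { KEY1, KEY2 }
    enum MyEnum { E1, E2, E3 }

    static final class MyType {
        private final MyKey myKey;
        private final Set<MyEnum> myEnums;

        MyType(MyKey myKey, Set<MyEnum> myEnums) {
            this.myKey = myKey;
            this.myEnums = myEnums;
        }

        MyKey getMyKey() { return myKey; }
        Set<MyEnum> getMyEnums() { return myEnums; }
    }

    // The three objects from the question's example.
    static List<MyType> sample() {
        return Arrays.asList(
                new MyType(MyKey.KEY1, EnumSet.of(MyEnum.E1)),
                new MyType(MyKey.KEY1, EnumSet.of(MyEnum.E2)),
                new MyType(MyKey.KEY2, EnumSet.of(MyEnum.E1, MyEnum.E3)));
    }

    // The question's collector: one HashSet is created here and
    // then mutated and returned by the operator for every group.
    static Map<MyKey, Set<MyEnum>> groupWithMutatingReducer(List<MyType> objs) {
        return objs.stream()
                .collect(Collectors.groupingBy(
                        MyType::getMyKey,
                        Collectors.reducing(
                                new HashSet<MyEnum>(),
                                MyType::getMyEnums,
                                (a, b) -> {
                                    a.addAll(b);
                                    return a;
                                })));
    }

    public static void main(String[] args) {
        Map<MyKey, Set<MyEnum>> map = groupWithMutatingReducer(sample());
        // Both keys map to the very same HashSet instance.
        System.out.println(map.get(MyKey.KEY1) == map.get(MyKey.KEY2)); // prints "true"
        System.out.println(map.get(MyKey.KEY1).size());                 // prints "3"
    }
}
```

The reference comparison is the telling part: the map does not hold two sets that happen to be equal, it holds one set twice.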

The BinaryOperator you supply as the third argument must be free of side effects, the same way common math operators are, e.g. x = y + z doesn't change the value of y and z.

This means you need to merge the two input sets a and b, without updating either.

Also, since the elements are enums, you should use EnumSet, not HashSet.

Map<MyKey, Set<MyEnum>> map = myObjs.stream()
        .collect(Collectors.groupingBy(
                MyType::getMyKey,
                Collectors.reducing(
                        EnumSet.noneOf(MyEnum.class), // <-- EnumSet
                        MyType::getMyEnums,
                        (a, b) -> {
                            EnumSet<MyEnum> c = EnumSet.copyOf(a); // <-- copy
                            c.addAll(b);
                            return c;
                        })));

UPDATE

A shorter, more streamlined version that doesn't keep creating new sets while accumulating the result:

Map<MyKey, Set<MyEnum>> map = myObjs.stream()
        .collect(Collectors.groupingBy(
                    MyType::getMyKey,
                    Collector.of(
                            () -> EnumSet.noneOf(MyEnum.class),
                            (r, myObj) -> r.addAll(myObj.getMyEnums()),
                            (r1, r2) -> { r1.addAll(r2); return r1; }
                    )));
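
For anyone who wants to try the Collector.of version end to end, the following is a minimal runnable sketch using the three sample objects from the question. The class name GroupAndMergeDemo, the MyKey enum constants, and the bare-bones MyType are assumptions made for illustration; the collector is the one shown above. Because the supplier lambda runs once per group, each key gets its own EnumSet.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class GroupAndMergeDemo {
    // Hypothetical stand-ins for the question's types.
    enum MyKey { KEY1, KEY2 }
    enum MyEnum { E1, E2, E3 }

    static final class MyType {
        private final MyKey myKey;
        private final Set<MyEnum> myEnums;

        MyType(MyKey myKey, Set<MyEnum> myEnums) {
            this.myKey = myKey;
            this.myEnums = myEnums;
        }

        MyKey getMyKey() { return myKey; }
        Set<MyEnum> getMyEnums() { return myEnums; }
    }

    static Map<MyKey, Set<MyEnum>> groupAndMerge(List<MyType> objs) {
        return objs.stream()
                .collect(Collectors.groupingBy(
                        MyType::getMyKey,
                        Collector.of(
                                // Supplier: a fresh, empty set for each group.
                                () -> EnumSet.noneOf(MyEnum.class),
                                // Accumulator: add one object's enums to its group's set.
                                (r, myObj) -> r.addAll(myObj.getMyEnums()),
                                // Combiner: merge two partial results (used by parallel streams).
                                (r1, r2) -> { r1.addAll(r2); return r1; })));
    }

    public static void main(String[] args) {
        List<MyType> myObjs = Arrays.asList(
                new MyType(MyKey.KEY1, EnumSet.of(MyEnum.E1)),
                new MyType(MyKey.KEY1, EnumSet.of(MyEnum.E2)),
                new MyType(MyKey.KEY2, EnumSet.of(MyEnum.E1, MyEnum.E3)));

        Map<MyKey, Set<MyEnum>> map = groupAndMerge(myObjs);
        System.out.println(map.get(MyKey.KEY1)); // prints "[E1, E2]"
        System.out.println(map.get(MyKey.KEY2)); // prints "[E1, E3]"
    }
}
```

Mutating r inside the accumulator is fine here, unlike in Collectors.reducing, because a Collector is designed around a mutable result container that the supplier creates per group.
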
Andreas answered Nov 15 '22 07:11
