
Is there a way to collect a map using "groupingBy" for MULTIPLE elements within a nested structure?

First, a bit of context code:

import java.util.*;
import java.util.concurrent.atomic.DoubleAdder;
import java.util.function.Function;
import java.util.stream.Collectors;

class Scratch {

  static enum Id {A, B, C}
  static class IdWrapper {
    private final Id id;
    public IdWrapper(Id id) {this.id = id;}
    Id getId() { return id; }
  }

  public static void main(String[] args) {
    Map<String, Object> v1 = new HashMap<>();
    v1.put("parents", new HashSet<>(Arrays.asList(new IdWrapper(Id.A), new IdWrapper(Id.B))));
    v1.put("size", 1d);

    Map<String, Object> v2 = new HashMap<>();
    v2.put("parents", new HashSet<>(Arrays.asList(new IdWrapper(Id.B), new IdWrapper(Id.C))));
    v2.put("size", 2d);

    Map<String, Map<String, Object>> allVs = new HashMap<>();
    allVs.put("v1", v1);
    allVs.put("v2", v2);

The above represents the data structure I am dealing with. I have an outer map (the key type is irrelevant) that contains inner "property maps" as values. These inner maps use strings to look up different kinds of data.

In the case I am working on, each v1, v2,... represents a "disk". Each disk has a specific size, but can have multiple parents.

Now I need to sum up the sizes per parent Id as Map<Id, Double>. For the above example, that map would be {B=3.0, A=1.0, C=2.0}.

The following code gives the expected result:

    HashMap<Id, DoubleAdder> adders = new HashMap<>();
    allVs.values().forEach(m -> {
        double size = (Double) m.get("size");
        Set<IdWrapper> wrappedIds = (Set<IdWrapper>) m.get("parents");
        wrappedIds.forEach(w -> adders.computeIfAbsent(w.getId(), a -> new DoubleAdder()).add(size));
    });

    System.out.println(adders.keySet().stream()
            .collect(Collectors.toMap(Function.identity(), key -> adders.get(key).doubleValue())));

But the code feels pretty clunky (like the fact that I need a second map for adding up the sizes).

I have a similar case, where there is always exactly one parent, and that can easily be solved using

collect(Collectors.groupingBy(..., Collectors.summingDouble(...)));
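For reference, the single-parent case could look roughly like the sketch below. It is only an illustration: the `"parent"` key, the `disk` helper, and the `SingleParentSums` class are hypothetical names that do not come from the real framework, and the parent is stored as a bare `Id` to keep the example short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SingleParentSums {

    enum Id { A, B, C }

    // Hypothetical helper: a property map with exactly one parent,
    // stored under the assumed key "parent".
    static Map<String, Object> disk(Id parent, double size) {
        Map<String, Object> props = new HashMap<>();
        props.put("parent", parent);
        props.put("size", size);
        return props;
    }

    // One element contributes to exactly one group, so a plain
    // groupingBy with a summingDouble downstream collector is enough.
    static Map<Id, Double> sumSizesPerParent(List<Map<String, Object>> disks) {
        return disks.stream()
                .collect(Collectors.groupingBy(
                        m -> (Id) m.get("parent"),
                        Collectors.summingDouble(m -> (Double) m.get("size"))));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> disks = Arrays.asList(
                disk(Id.A, 1d), disk(Id.B, 2d), disk(Id.B, 4d));
        // Contains A=1.0 and B=6.0; the iteration order is not guaranteed.
        System.out.println(sumSizesPerParent(disks));
    }
}
```

This works because every element maps to a single key. With several parents per disk, one element would have to land in several groups, which `groupingBy` alone does not do.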

But I am lost for the "multiple" parents case.

So, question: can the above transformation to compute the required Map<Id, Double> be rewritten using groupingBy()?

And just for the record: the above is just an MCVE for the problem I need an answer for. I understand that the "data layout" might look strange. In reality, we do have distinct classes representing these "disks", for example. But our "framework" also allows accessing the properties of any object in the database using such IDs and property names. And when you have performance issues, fetching data in such a "raw property map" way is sometimes orders of magnitude faster than accessing the true "disk" objects themselves. In other words: I can't change anything about the context. My question is solely about rewriting that computation.

(I am constrained to Java 8 and "standard" Java libraries, but additional answers for newer versions of Java or nice non-standard ways of solving this will be appreciated, too.)

GhostCat asked Nov 22 '18 08:11


1 Answer

Here's a single stream pipeline solution:

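// SimpleEntry is java.util.AbstractMap.SimpleEntry; "import java.util.*" does not cover it,
// so it needs its own import.
Map<Id, Double> sums = allVs.values()
        .stream()
        // turn each property map into one (parent Id, size) entry per parent
        .flatMap(m -> ((Set<IdWrapper>) m.get("parents")).stream()
                .map(i -> new SimpleEntry<Id, Double>(i.getId(), (Double) m.get("size"))))
        // group the entries by Id and sum the sizes within each group
        .collect(Collectors.groupingBy(Map.Entry::getKey,
                Collectors.summingDouble(Map.Entry::getValue)));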

Output:

{B=3.0, A=1.0, C=2.0}

The idea is to convert each inner Map to a Stream of entries where the key is an Id (of the "parents" Set) and the value is the corresponding "size".

Then it's easy to group the Stream into the desired output.
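To try the pipeline outside the original `Scratch` class, it can be wrapped in a small self-contained program. The following is a minimal sketch targeting Java 8. The `ParentSizeSums` class and the `disk` helper are names made up for this illustration, while `Id`, `IdWrapper`, and the `"parents"` and `"size"` keys mirror the question's code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class ParentSizeSums {

    enum Id { A, B, C }

    static class IdWrapper {
        private final Id id;
        IdWrapper(Id id) { this.id = id; }
        Id getId() { return id; }
    }

    // Hypothetical helper: builds one raw property map in the same
    // shape as v1 and v2 in the question.
    static Map<String, Object> disk(double size, Id... parents) {
        Set<IdWrapper> wrapped = new HashSet<>();
        for (Id parent : parents) {
            wrapped.add(new IdWrapper(parent));
        }
        Map<String, Object> props = new HashMap<>();
        props.put("parents", wrapped);
        props.put("size", size);
        return props;
    }

    @SuppressWarnings("unchecked") // the property maps are untyped, so the cast cannot be checked
    static Map<Id, Double> sumSizesPerParent(Map<String, Map<String, Object>> disks) {
        return disks.values().stream()
                // one (parent Id, size) entry per parent of each disk
                .flatMap(m -> ((Set<IdWrapper>) m.get("parents")).stream()
                        .map(w -> new SimpleEntry<Id, Double>(w.getId(), (Double) m.get("size"))))
                // group by parent Id and add up the sizes
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.summingDouble(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> allVs = new HashMap<>();
        allVs.put("v1", disk(1d, Id.A, Id.B));
        allVs.put("v2", disk(2d, Id.B, Id.C));
        // Contains A=1.0, B=3.0, C=2.0; the iteration order is not guaranteed.
        System.out.println(sumSizesPerParent(allVs));
    }
}
```

The result of `groupingBy` is a `HashMap` in current JDK implementations, so the printed order of the keys can differ between runs. If a stable order matters, the three-argument `groupingBy` overload accepts a map supplier such as `() -> new EnumMap<>(Id.class)`.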

Eran answered Dec 08 '22 00:12