
Java 8 Streams: Map the same object multiple times based on different properties

I was presented with an interesting problem by a colleague of mine, and I was unable to find a neat and pretty Java 8 solution. The problem is to stream through a list of POJOs and then collect them in a map based on multiple properties, so that each POJO occurs in the map multiple times.

Imagine the following POJO:

private static class Customer {
    public String first;
    public String last;

    public Customer(String first, String last) {
        this.first = first;
        this.last = last;
    }

    public String toString() {
        return "Customer(" + first + " " + last + ")";
    }
}

Set it up as a List<Customer>:

// The list of customers
List<Customer> customers = Arrays.asList(
        new Customer("Johnny", "Puma"),
        new Customer("Super", "Mac"));

Alternative 1: Use a Map outside of the "stream" (or rather outside forEach).

// Alt 1: not pretty since the resulting map is "outside" of
// the stream. If parallel streams are used it must be
// a ConcurrentHashMap
Map<String, Customer> res1 = new HashMap<>();
customers.stream().forEach(c -> {
    res1.put(c.first, c);
    res1.put(c.last, c);
});

Alternative 2: Create map entries and stream them, then flatMap them. IMO it is a bit too verbose and not so easy to read.

// Alt 2: A bit verbose, and "new AbstractMap.SimpleEntry" feels like
// a "hard" dependency on AbstractMap
Map<String, Customer> res2 =
        customers.stream()
                .map(p -> {
                    Map.Entry<String, Customer> firstEntry = new AbstractMap.SimpleEntry<>(p.first, p);
                    Map.Entry<String, Customer> lastEntry = new AbstractMap.SimpleEntry<>(p.last, p);
                    return Stream.of(firstEntry, lastEntry);
                })
                .flatMap(Function.identity())
                .collect(Collectors.toMap(
                        Map.Entry::getKey, Map.Entry::getValue));

Alternative 3: This is the "prettiest" code I have come up with so far, but it uses the three-arg version of reduce, and the third argument is a bit dodgy, as discussed in this question: Purpose of third argument to 'reduce' function in Java 8 functional programming. Furthermore, reduce does not seem like a good fit for this problem since it mutates its accumulator, and parallel streams may not work with the approach below.

// Alt 3: using reduce. Not so pretty
Map<String, Customer> res3 = customers.stream().reduce(
        new HashMap<>(),
        (m, p) -> {
            m.put(p.first, p);
            m.put(p.last, p);
            return m;
        },
        (m1, m2) -> m2 /* <- NOT USED UNLESS PARALLEL */);

If the above code is printed like this:

System.out.println(res1);
System.out.println(res2);
System.out.println(res3);

The result would be:

{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}

So, now to my question: how should I, in idiomatic Java 8, stream through the List<Customer> and collect it as a Map<String, Customer> where each Customer is mapped under two keys (first AND last), i.e. each Customer occurs twice in the map? I do not want to use any 3rd-party libraries, and I do not want to use a map outside of the stream as in alternative 1. Are there any other nice alternatives?

The full code can be found on hastebin for simple copy-paste to get the whole thing running.

wassgren asked Feb 13 '15 20:02

People also ask

Can a stream be used multiple times?

From the documentation: A stream should be operated on (invoking an intermediate or terminal stream operation) only once. A stream implementation may throw IllegalStateException if it detects that the stream is being reused. So the answer is no, streams are not meant to be reused.
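A minimal sketch of this behavior (the class name and sample values are illustrative):

```java
import java.util.stream.Stream;

public class StreamReuse {
    public static void main(String[] args) {
        Stream<String> s = Stream.of("a", "b", "c");
        s.forEach(System.out::println); // terminal operation consumes the stream

        try {
            // Reusing the consumed stream triggers an IllegalStateException
            s.forEach(System.out::println);
        } catch (IllegalStateException e) {
            System.out.println("Stream cannot be reused");
        }
    }
}
```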

Are Java 8 streams lazy?

The Java 8 Streams API is built around a 'process only on demand' strategy and hence supports laziness. In the Java 8 Streams API, the intermediate operations are lazy, and their internal processing model is optimised to process large amounts of data with high performance.
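Laziness can be observed with peek, which reports which elements the pipeline actually touches (the class name and sample values below are illustrative):

```java
import java.util.stream.Stream;

public class LazyStreams {
    public static void main(String[] args) {
        // peek lets us watch which elements flow through the pipeline
        String first = Stream.of("Johnny", "Super", "Mac")
                .peek(s -> System.out.println("inspecting " + s))
                .filter(s -> s.startsWith("S"))
                .findFirst()
                .orElse("none");
        // Only "Johnny" and "Super" are inspected; "Mac" is never touched,
        // because findFirst short-circuits as soon as a match is found.
        System.out.println("result: " + first);
    }
}
```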

Are Java streams multi threaded?

Java 8 introduced the concept of Streams as an efficient way of carrying out bulk operations on data. And parallel Streams can be obtained in environments that support concurrency. These streams can come with improved performance – at the cost of multi-threading overhead.
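As a small illustration (class name and values are made up for this sketch), the same pipeline can run sequentially or in parallel; because summing is associative, the result is identical either way:

```java
import java.util.Arrays;
import java.util.List;

public class ParallelSum {
    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(1, 2, 3, 4, 5);
        // Sequential pipeline
        int sequential = nums.stream().mapToInt(Integer::intValue).sum();
        // Same pipeline, but split across worker threads
        int parallel = nums.parallelStream().mapToInt(Integer::intValue).sum();
        System.out.println(sequential + " " + parallel); // both print 15
    }
}
```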

What is the purpose of the map method in Java 8 streams?

Java 8 Stream's map method is an intermediate operation that consumes a single element from the input Stream and produces a single element on the output Stream. It is simply used to convert a Stream of one type to a Stream of another.
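For example, a one-to-one conversion from Stream<String> to Stream<Integer> (class name and sample values are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Johnny", "Super");
        // map: one input element -> one output element, String -> Integer
        List<Integer> lengths = names.stream()
                .map(String::length)
                .collect(Collectors.toList());
        System.out.println(lengths); // [6, 5]
    }
}
```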


1 Answer

I think your alternatives 2 and 3 can be re-written to be more clear:

Alternative 2:

Map<String, Customer> res2 = customers.stream()
    .flatMap(c -> Stream.of(c.first, c.last)
        .map(k -> new AbstractMap.SimpleImmutableEntry<>(k, c)))
    .collect(toMap(Map.Entry::getKey, Map.Entry::getValue));

Alternative 3: Your code abuses reduce by mutating the HashMap. To do mutable reduction, use collect:

Map<String, Customer> res3 = customers.stream()
    .collect(
        HashMap::new,
        (m, c) -> { m.put(c.first, c); m.put(c.last, c); },
        HashMap::putAll
    );

Note that these are not identical. Alternative 2 will throw an exception if there are duplicate keys while Alternative 3 will silently overwrite the entries.
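A small sketch of this difference, using plain strings in place of the Customer POJO (the class name and sample values are made up; "Mac" occurs twice so the key collides):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DuplicateKeyDemo {
    public static void main(String[] args) {
        // toMap without a merge function throws on the duplicate "Mac" key
        try {
            Stream.of("Mac", "Johnny", "Mac")
                    .collect(Collectors.toMap(s -> s, String::length));
        } catch (IllegalStateException e) {
            System.out.println("toMap throws on duplicate keys");
        }

        // The three-arg collect simply calls put again, overwriting silently
        Map<String, Integer> m = Stream.of("Mac", "Johnny", "Mac")
                .collect(HashMap::new,
                        (map, s) -> map.put(s, s.length()),
                        HashMap::putAll);
        System.out.println(m.size()); // 2 entries: "Mac" and "Johnny"
    }
}
```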

If overwriting entries in case of duplicate keys is what you want, I would personally prefer Alternative 3. It is immediately clear to me what it does. It most closely resembles the iterative solution. I would expect it to be more performant as Alternative 2 has to do a bunch of allocations per customer with all that flatmapping.

However, Alternative 2 has a huge advantage over Alternative 3 by separating the production of entries from their aggregation. This gives you a great deal of flexibility. For example, if you want to change Alternative 2 to overwrite entries on duplicate keys instead of throwing an exception, you would simply add (a,b) -> b to toMap(...). If you decide you want to collect matching entries into a list, all you would have to do is replace toMap(...) with groupingBy(...), etc.
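Both variations can be sketched briefly (the class name and sample values are illustrative, again using plain strings for the keys):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toMap;

public class FlexibleCollect {
    public static void main(String[] args) {
        // Overwrite on duplicate keys by adding a merge function to toMap:
        Map<String, Integer> lastWins = Stream.of("Mac", "Johnny", "Mac")
                .collect(toMap(s -> s, String::length, (a, b) -> b));
        System.out.println(lastWins.size()); // 2 - no exception despite the duplicate

        // Or collect all matches per key into a list with groupingBy:
        Map<Integer, List<String>> byLength = Stream.of("Mac", "Johnny", "Super")
                .collect(groupingBy(String::length));
        System.out.println(byLength);
    }
}
```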

Misha answered Sep 21 '22 22:09