When and how to perform one to 0..n mapping Stream mapMulti over flatMap

Tags:

java

java-stream

java-17

flatmap

mapmulti

I have been skimming through the news and the source code of the newest LTS release, Java 17, and came across a new Stream method called mapMulti. The early-access JavaDoc says it is similar to flatMap.

```
<R> Stream<R> mapMulti(BiConsumer<? super T,? super Consumer<R>> mapper)
```

How do I perform a one to 0..n mapping using this method?
How does the new method work and how does it differ from flatMap? When is each one preferable?
How many times can the mapper be called?

306

asked Sep 30 '20 07:09

Nikolas Charalambidis

2 Answers

Stream::mapMulti is a new method that is classified as an intermediate operation.

It takes a BiConsumer<T, Consumer<R>> mapper, which receives the element about to be processed and a Consumer. The latter makes the method look strange at first glance because it differs from what we are used to in other intermediate operations such as map or filter, which take a Function or a Predicate rather than a *Consumer (peek does take a Consumer, but only to observe elements, not to emit them).

The purpose of the Consumer, which the API itself supplies to the lambda expression, is to accept any number of elements that should be available in the subsequent pipeline. Every element passed to it, regardless of how many, is propagated downstream.

Explanation using simple snippets

One to some (0..1) mapping (similar to filter)

Calling consumer.accept(R r) for only some selected items yields a filter-like pipeline. This is useful when an element is checked against a predicate and then mapped to a different value, which would otherwise be done using a combination of filter and map. For example:
```
Stream.of("Java", "Python", "JavaScript", "C#", "Ruby")
      .mapMulti((str, consumer) -> {
          if (str.length() > 4) {
              consumer.accept(str.length());  // lengths larger than 4
          }
      })
      .forEach(i -> System.out.print(i + " "));

// 6 10
```
One to one mapping (similar to map)

Working with the previous example, when the condition is omitted and every element is mapped into a new one and accepted using the consumer, the method effectively behaves like map:
```
Stream.of("Java", "Python", "JavaScript", "C#", "Ruby")
      .mapMulti((str, consumer) -> consumer.accept(str.length()))
      .forEach(i -> System.out.print(i + " "));

// 4 6 10 2 4
```

One to many mapping (similar to flatMap)

Here things get interesting because one can call consumer.accept(R r) any number of times. Let's say we want to emit the number representing the String length as many times as its own value, i.e. 2 becomes 2, 2; 4 becomes 4, 4, 4, 4; and 0 becomes nothing.

```
Stream.of("Java", "Python", "JavaScript", "C#", "Ruby", "")
      .mapMulti((str, consumer) -> {
          for (int i = 0; i < str.length(); i++) {
              consumer.accept(str.length());
          }
      })
      .forEach(i -> System.out.print(i + " "));

// 4 4 4 4 6 6 6 6 6 6 10 10 10 10 10 10 10 10 10 10 2 2 4 4 4 4
```
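To make the question of how many times the mapper can be called concrete: the mapper itself runs once per upstream element, while the consumer it receives may be invoked any number of times during each run. The following is a minimal sketch that counts both; the class name `MapperCallCount` and the helper `count` are illustrative names, not part of any API.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class MapperCallCount {

    // Hypothetical helper: returns {mapper invocations, emitted elements}.
    static int[] count(List<String> words) {
        AtomicInteger mapperCalls = new AtomicInteger();
        List<Integer> emitted = words.stream()
                .<Integer>mapMulti((str, consumer) -> {
                    mapperCalls.incrementAndGet();        // runs once per input element
                    for (int i = 0; i < str.length(); i++) {
                        consumer.accept(str.length());    // runs 0..n times per input element
                    }
                })
                .collect(Collectors.toList());
        return new int[] { mapperCalls.get(), emitted.size() };
    }

    public static void main(String[] args) {
        int[] result = count(List.of("Java", "C#", ""));
        System.out.println(result[0] + " mapper calls, " + result[1] + " elements");
        // 3 mapper calls, 6 elements
    }
}
```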

Comparison with flatMap

The very idea of this mechanism is that the consumer can be called multiple times (including zero), and its internal use of SpinedBuffer allows the elements to be pushed into a single flattened Stream instance without creating a new one for every group of output elements, unlike flatMap. The JavaDoc states two use cases in which this method is preferable to flatMap:

When replacing each stream element with a small (possibly zero) number of elements. Using this method avoids the overhead of creating a new Stream instance for every group of result elements, as required by flatMap.

When it is easier to use an imperative approach for generating result elements than it is to return them in the form of a Stream.

Performance-wise, the new method mapMulti is a winner in such cases. Check out the benchmark at the bottom of this answer.
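As a small illustration of the first use case, the sketch below flattens a list of small groups once with flatMap and once with mapMulti. It only shows that both produce the same elements in the same order; it says nothing about timing. The class and method names are made up for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapVsMapMulti {

    // flatMap: a (possibly empty) Stream is created for every input element.
    static List<String> withFlatMap(List<List<String>> groups) {
        return groups.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // mapMulti: each result element is handed straight to the supplied consumer.
    static List<String> withMapMulti(List<List<String>> groups) {
        return groups.stream()
                .<String>mapMulti((group, consumer) -> group.forEach(consumer))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> groups =
                List.of(List.of("a", "b"), List.<String>of(), List.of("c"));
        System.out.println(withFlatMap(groups));   // [a, b, c]
        System.out.println(withMapMulti(groups));  // [a, b, c]
    }
}
```

The explicit `<String>` type witness is there because the result type of mapMulti cannot be inferred from an implicitly typed lambda in the middle of a call chain.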

Filter-map scenario

Using this method instead of filter or map on their own doesn't make sense due to its verbosity and the fact that one intermediate stream is created anyway. The exception might be replacing a .filter(..).map(..) chain, which comes in handy in cases such as checking an element's type and casting it.

```
int sum = Stream.of(1, 2.0, 3.0, 4F, 5, 6L)
                .mapMultiToInt((number, consumer) -> {
                    if (number instanceof Integer) {
                        consumer.accept((Integer) number);
                    }
                })
                .sum();
// 6
```

```
int sum = Stream.of(1, 2.0, 3.0, 4F, 5, 6L)
                .filter(number -> number instanceof Integer)
                .mapToInt(number -> (Integer) number)
                .sum();
```

As seen above, variations such as mapMultiToDouble, mapMultiToInt and mapMultiToLong were introduced as well. They come alongside the mapMulti methods of the primitive streams, such as IntStream mapMulti(IntStream.IntMapMultiConsumer mapper). Three new functional interfaces were introduced too. Basically, they are the primitive variations of BiConsumer<T, Consumer<R>>, for example:

```
@FunctionalInterface
interface IntMapMultiConsumer {
    void accept(int value, IntConsumer ic);
}
```
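A short sketch of the primitive variants in use may help here. The first method goes from a Stream of objects to an IntStream via mapMultiToInt, the second stays within IntStream and therefore takes an IntStream.IntMapMultiConsumer. The class name, method names and sample data are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

public class PrimitiveMapMulti {

    // Stream<String> -> IntStream: skip blank strings, emit the length of the rest.
    static int totalLength(List<String> words) {
        return words.stream()
                .mapMultiToInt((word, consumer) -> {
                    if (!word.isBlank()) {
                        consumer.accept(word.length());   // consumer is an IntConsumer
                    }
                })
                .sum();
    }

    // IntStream -> IntStream: drop odd numbers, emit every even number twice.
    static int[] duplicateEvens(int... values) {
        return IntStream.of(values)
                .mapMulti((value, consumer) -> {          // an IntStream.IntMapMultiConsumer
                    if (value % 2 == 0) {
                        consumer.accept(value);
                        consumer.accept(value);
                    }
                })
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(totalLength(List.of("Java", " ", "C#")));      // 6
        System.out.println(Arrays.toString(duplicateEvens(1, 2, 3, 4)));  // [2, 2, 4, 4]
    }
}
```

No type witness is needed in either case because the primitive variants have a fixed result type.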

Combined real use-case scenario

The real power of this method lies in its flexibility and in the fact that it creates only one Stream at a time, which is the major advantage over flatMap. The two snippets below flat-map a Product and its List<Variation> into 0..n offers, represented by the Offer class, based on certain conditions (the product category and the variation availability).

Product with String name, int basePrice, String category and List<Variation> variations.
Variation with String name, int price and boolean availability.

```
List<Product> products = ...
List<Offer> offers = products.stream()
        // The explicit <Offer> type witness is required here: without it, R is
        // inferred as Object and the result cannot be assigned to List<Offer>.
        .<Offer>mapMulti((product, consumer) -> {
            if ("PRODUCT_CATEGORY".equals(product.getCategory())) {
                for (Variation v : product.getVariations()) {
                    if (v.isAvailable()) {
                        Offer offer = new Offer(
                            product.getName() + "_" + v.getName(),
                            product.getBasePrice() + v.getPrice());
                        consumer.accept(offer);
                    }
                }
            }
        })
        .collect(Collectors.toList());
```

```
List<Product> products = ...
List<Offer> offers = products.stream()
        .filter(product -> "PRODUCT_CATEGORY".equals(product.getCategory()))
        .flatMap(product -> product.getVariations().stream()
            .filter(Variation::isAvailable)
            .map(v -> new Offer(
                product.getName() + "_" + v.getName(),
                product.getBasePrice() + v.getPrice()
            ))
        )
        .collect(Collectors.toList());
```

The use of mapMulti is more imperative compared to the declarative approach of combining the Stream methods available in previous versions, as seen in the latter snippet using flatMap, map, and filter. From this perspective, it depends on the use case whether an imperative approach is easier to use. Recursion is a good example described in the JavaDoc.
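For readers who want to run the comparison, here is a self-contained sketch of the same scenario. The domain types are modelled as records (Java 16+) purely for brevity, so accessor names differ from the getters used above, and the sample data and category names are invented.

```java
import java.util.List;
import java.util.stream.Collectors;

public class OfferDemo {

    // Hypothetical domain types standing in for Product, Variation and Offer.
    record Variation(String name, int price, boolean available) {}
    record Product(String name, int basePrice, String category, List<Variation> variations) {}
    record Offer(String name, int price) {}

    // Imperative style: category check, loop and availability check in one mapper.
    static List<Offer> offersMapMulti(List<Product> products, String category) {
        return products.stream()
                .<Offer>mapMulti((product, consumer) -> {
                    if (category.equals(product.category())) {
                        for (Variation v : product.variations()) {
                            if (v.available()) {
                                consumer.accept(new Offer(
                                        product.name() + "_" + v.name(),
                                        product.basePrice() + v.price()));
                            }
                        }
                    }
                })
                .collect(Collectors.toList());
    }

    // Declarative style: filter, then a nested stream per product.
    static List<Offer> offersFlatMap(List<Product> products, String category) {
        return products.stream()
                .filter(product -> category.equals(product.category()))
                .flatMap(product -> product.variations().stream()
                        .filter(Variation::available)
                        .map(v -> new Offer(
                                product.name() + "_" + v.name(),
                                product.basePrice() + v.price())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Product> products = List.of(
                new Product("phone", 100, "ELECTRONICS", List.of(
                        new Variation("red", 10, true),
                        new Variation("blue", 20, false))),
                new Product("chair", 50, "FURNITURE", List.of(
                        new Variation("oak", 5, true))));

        System.out.println(offersMapMulti(products, "ELECTRONICS"));
        // [Offer[name=phone_red, price=110]]
        System.out.println(offersMapMulti(products, "ELECTRONICS")
                .equals(offersFlatMap(products, "ELECTRONICS")));  // true
    }
}
```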

Benchmark

As promised, I have written a bunch of micro-benchmarks based on ideas collected from the comments. Since there is quite a lot of code to publish, I have created a GitHub repository with the implementation details and share only the results here.

Stream::flatMap(Function) vs Stream::mapMulti(BiConsumer) Source

Here we can see a huge difference, which shows that the newer method works as described: using it avoids the overhead of creating a new Stream instance for each processed element.

Benchmark                                   Mode  Cnt   Score   Error  Units
MapMulti_FlatMap.flatMap                    avgt   25  73.852 ± 3.433  ns/op
MapMulti_FlatMap.mapMulti                   avgt   25  17.495 ± 0.476  ns/op

Stream::filter(Predicate).map(Function) vs Stream::mapMulti(BiConsumer) Source

Using chained pipelines (as opposed to nested ones) is fine: both variants perform about the same here.

Benchmark                                   Mode  Cnt    Score  Error  Units
MapMulti_FilterMap.filterMap                avgt   25   7.973 ± 0.378  ns/op
MapMulti_FilterMap.mapMulti                 avgt   25   7.765 ± 0.633  ns/op

Stream::flatMap(Function) with Optional::stream() vs Stream::mapMulti(BiConsumer) Source

This one is very interesting, especially in terms of usage (see the source code): we are now able to flatten using mapMulti(Optional::ifPresent) and, as expected, the new method is faster in this case (roughly twice as fast in the results below).

Benchmark                                   Mode  Cnt   Score   Error  Units
MapMulti_FlatMap_Optional.flatMap           avgt   25  20.186 ± 1.305  ns/op
MapMulti_FlatMap_Optional.mapMulti          avgt   25  10.498 ± 0.403  ns/op
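The `mapMulti(Optional::ifPresent)` idiom is perhaps easiest to see next to its flatMap counterpart. The sketch below is not the benchmark code from the repository, only a minimal illustration with made-up class and method names.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class OptionalFlatten {

    // Creates a zero- or one-element Stream per Optional.
    static List<String> viaFlatMap(List<Optional<String>> values) {
        return values.stream()
                .flatMap(Optional::stream)
                .collect(Collectors.toList());
    }

    // Equivalent to (optional, consumer) -> optional.ifPresent(consumer).
    static List<String> viaMapMulti(List<Optional<String>> values) {
        return values.stream()
                .<String>mapMulti(Optional::ifPresent)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Optional<String>> values =
                List.of(Optional.of("a"), Optional.<String>empty(), Optional.of("b"));
        System.out.println(viaFlatMap(values));   // [a, b]
        System.out.println(viaMapMulti(values));  // [a, b]
    }
}
```

The method reference works because the shape of `Optional.ifPresent(Consumer)` matches the mapper exactly: the Optional is the stream element and the downstream consumer is its argument.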

158

answered Nov 15 '22 06:11

Nikolas Charalambidis

To address the scenario

When it is easier to use an imperative approach for generating result elements than it is to return them in the form of a Stream.

We can see it as now having a limited variant of C#’s yield statement. The limitations are that we always need an initial input from a stream, as this is an intermediate operation, and that there’s no short-circuiting for the elements we’re pushing within one function evaluation.

Still, it opens interesting opportunities.

E.g., implementing a stream of Fibonacci numbers formerly required a solution using temporary objects capable of holding two values.

Now, we can use something like:

```
IntStream.of(0)
    .mapMulti((a,c) -> {
        for(int b = 1; a >=0; b = a + (a = b))
            c.accept(a);
    })
    /* additional stream operations here */
    .forEach(System.out::println);
```

It stops when the int values overflow. As said, it won’t short-circuit when we use a terminal operation that does not consume all values; however, this loop producing then-ignored values might still be faster than the other approaches.
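The missing short-circuiting can be made visible by counting how many values the loop pushes while only a few are kept. The sketch below wraps the Fibonacci loop with a counter and a `limit`; the class name, the `Result` record and the counter are additions for illustration only.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class FibonacciLimit {

    // Hypothetical holder: the values that survived limit(n) and the number of values pushed.
    record Result(List<Integer> kept, int pushed) {}

    static Result firstFibonacci(int n) {
        AtomicInteger pushed = new AtomicInteger();
        List<Integer> kept = IntStream.of(0)
                .mapMulti((a, c) -> {
                    // Runs until a overflows to a negative value, whatever happens downstream.
                    for (int b = 1; a >= 0; b = a + (a = b)) {
                        pushed.incrementAndGet();
                        c.accept(a);
                    }
                })
                .limit(n)          // keeps the first n values, cannot stop the loop above
                .boxed()
                .collect(Collectors.toList());
        return new Result(kept, pushed.get());
    }

    public static void main(String[] args) {
        Result result = firstFibonacci(10);
        System.out.println(result.kept());    // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
        System.out.println(result.pushed());  // 47
    }
}
```

All 47 Fibonacci numbers that fit into an int are pushed even though only ten are kept, which is cheap here but worth keeping in mind for longer-running loops.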

Another example inspired by this answer, to iterate over a class hierarchy from root to most specific:

```
Stream.of(LinkedHashMap.class).mapMulti(MapMultiExamples::hierarchy)
    /* additional stream operations here */
    .forEach(System.out::println);
```

```
static void hierarchy(Class<?> cl, Consumer<? super Class<?>> co) {
    if(cl != null) {
        hierarchy(cl.getSuperclass(), co);
        co.accept(cl);
    }
}
```

Unlike the old approaches, this does not require additional heap storage and will likely run faster (assuming reasonable class depths that do not make recursion backfire).

Also monsters like this

```
List<A> list = IntStream.range(0, r_i).boxed()
    .flatMap(i -> IntStream.range(0, r_j).boxed()
        .flatMap(j -> IntStream.range(0, r_k)
            .mapToObj(k -> new A(i, j, k))))
    .collect(Collectors.toList());
```

can now be written like

```
List<A> list = IntStream.range(0, r_i).boxed()
    .<A>mapMulti((i,c) -> {
        for(int j = 0; j < r_j; j++) {
            for(int k = 0; k < r_k; k++) {
                c.accept(new A(i, j, k));
            }
        }
    })
    .collect(Collectors.toList());
```

Compared to the nested flatMap steps, it loses some parallelism opportunity, which the reference implementation didn’t exploit anyway. For a non-short-circuiting operation like above, the new method will likely benefit from the reduced boxing and fewer instantiations of capturing lambda expressions. But of course, it should be used judiciously, not to rewrite every construct to an imperative version (after so many people tried to rewrite every imperative code into a functional version)…
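To check that the two formulations above yield the same elements in the same order, here is a self-contained sketch. The record `A` and the range parameters are stand-ins, since the original snippets leave them undefined.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class TripleLoop {

    // Hypothetical element type standing in for A.
    record A(int i, int j, int k) {}

    // Nested flatMap: one inner stream per i and one per (i, j) pair.
    static List<A> nestedFlatMap(int ri, int rj, int rk) {
        return IntStream.range(0, ri).boxed()
                .flatMap(i -> IntStream.range(0, rj).boxed()
                        .flatMap(j -> IntStream.range(0, rk)
                                .mapToObj(k -> new A(i, j, k))))
                .collect(Collectors.toList());
    }

    // mapMulti: plain nested loops, only the outer index is boxed.
    static List<A> withMapMulti(int ri, int rj, int rk) {
        return IntStream.range(0, ri).boxed()
                .<A>mapMulti((i, c) -> {
                    for (int j = 0; j < rj; j++) {
                        for (int k = 0; k < rk; k++) {
                            c.accept(new A(i, j, k));
                        }
                    }
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<A> expected = nestedFlatMap(2, 3, 4);
        List<A> actual = withMapMulti(2, 3, 4);
        System.out.println(actual.size());            // 24
        System.out.println(expected.equals(actual));  // true
    }
}
```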

answered Nov 15 '22 05:11

Holger
