Suppose I have multiple Java 8 streams, each of which can potentially be converted into a Set<AppStory>. I now want to aggregate all the streams, with the best performance, into one stream that is DISTINCT by ID and sorted by the property "lastUpdate".
There are several ways to do this, but I want the fastest one. For example:
Set<AppStory> appStr1 = StreamSupport.stream(splititerato1, true)
        .map(storyId1 -> vertexToStory1(storyId1))
        .collect(toSet());

Set<AppStory> appStr2 = StreamSupport.stream(splititerato2, true)
        .map(storyId2 -> vertexToStory2(storyId2))
        .collect(toSet());

Set<AppStory> appStr3 = StreamSupport.stream(splititerato3, true)
        .map(storyId3 -> vertexToStory3(storyId3))
        .collect(toSet());

Set<AppStory> set = new HashSet<>();
set.addAll(appStr1);
set.addAll(appStr2);
set.addAll(appStr3);

...and then sort by "lastUpdate".
//POJO:
public class AppStory implements Comparable<AppStory> {

    private String storyId;
    // ... many other attributes ...

    public String getStoryId() {
        return storyId;
    }

    @Override
    public int compareTo(AppStory o) {
        return this.getStoryId().compareTo(o.getStoryId());
    }
}
...but that is the old way.
How can I create ONE stream, DISTINCT by ID and sorted by "lastUpdate", with the BEST PERFORMANCE? Something like:

Set<AppStory> finalSet = distinctStream.sorted((v1, v2) -> Integer.compare(/* not my issue */))
        .collect(toSet());

Any ideas?
BR
Vitaly
I think the parallel overhead is much greater than the actual work, as you stated in the comments, so let your Streams do the job sequentially.

FYI: you should prefer Stream::concat, because slicing operations like Stream::limit can be bypassed by Stream::flatMap.
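To make that flatMap point concrete, here is a small self-contained sketch (not taken from the question's code). On Java 8 the first pipeline typically pushes all five elements through peek even though limit(1) only needs one, while the concat pipeline stops after the first element; later JDKs largely fixed the flatMap behaviour:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class ConcatVsFlatMap {
    public static void main(String[] args) {
        // flatMap: on Java 8 the whole inner stream is pushed downstream,
        // so peek usually fires for every element despite limit(1).
        Stream.of(Arrays.asList(1, 2, 3, 4, 5))
              .flatMap(List::stream)
              .peek(i -> System.out.println("flatMap saw " + i))
              .limit(1)
              .forEach(i -> System.out.println("flatMap result " + i));

        // concat: the limit stays in control of the pipeline, so only the
        // first element is pulled through.
        Stream.concat(Stream.of(1, 2, 3), Stream.of(4, 5))
              .peek(i -> System.out.println("concat saw " + i))
              .limit(1)
              .forEach(i -> System.out.println("concat result " + i));
    }
}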
Stream::sorted collects every element of the Stream into a List, sorts that List, and then pushes the elements down the pipeline in the desired order, where they are collected again. That double pass can be avoided by collecting the elements into a List yourself and sorting afterwards. A List is a far better choice than a Set here because the order matters (there is a LinkedHashSet, but you can't sort it).
This is, in my opinion, the cleanest and maybe the fastest solution (we can't really prove which is fastest without measuring):
Stream<AppStory> appStr1 = StreamSupport.stream(splititerato1, false)
                                        .map(this::vertexToStory1);
Stream<AppStory> appStr2 = StreamSupport.stream(splititerato2, false)
                                        .map(this::vertexToStory2);
Stream<AppStory> appStr3 = StreamSupport.stream(splititerato3, false)
                                        .map(this::vertexToStory3);

// concatenate, deduplicate, collect once, then sort in place
List<AppStory> stories = Stream.concat(Stream.concat(appStr1, appStr2), appStr3)
                               .distinct()
                               .collect(Collectors.toList());

// assuming AppStory::getLastUpdateTime is of type `long`
stories.sort(Comparator.comparingLong(AppStory::getLastUpdateTime));
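One caveat worth hedging on: Stream::distinct deduplicates via equals/hashCode, not via compareTo, so for "distinct by ID" to work AppStory should override both based on storyId. A minimal sketch of that, with an assumed lastUpdateTime field backing the getLastUpdateTime accessor used above:

import java.util.Objects;

public class AppStory implements Comparable<AppStory> {

    private String storyId;
    private long lastUpdateTime; // assumed field backing getLastUpdateTime

    public String getStoryId() {
        return storyId;
    }

    public long getLastUpdateTime() {
        return lastUpdateTime;
    }

    @Override
    public int compareTo(AppStory o) {
        return this.getStoryId().compareTo(o.getStoryId());
    }

    // Stream::distinct relies on these two, so base them on the ID as well:
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AppStory)) return false;
        return Objects.equals(storyId, ((AppStory) o).storyId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(storyId);
    }
}

And if the call site really needs a Set rather than a List, wrapping the already-sorted list in a LinkedHashSet (new LinkedHashSet<>(stories)) preserves the sorted order.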