Complexity of grouping in Java8

Tags: java, time-complexity, java-8, java-stream, collectors

I would like to know the time complexity of the statement below (in Java 8):

list.stream().collect(groupingBy(...));

Any idea?

asked Nov 26 '16 21:11

FreeMan

1 Answer

There is no general answer to this question, because the time complexity depends on every operation involved. Since the entire stream has to be processed, there is a base time complexity of O(n), which has to be multiplied by the cost of all operations performed per element. This assumes that the cost of the iteration itself is not worse than O(n), which is the case for most stream sources.

So, assuming there are no intermediate operations that affect the time complexity, groupingBy has to evaluate the classifier function for each element. That evaluation should be independent of the other elements, so it does not affect the time complexity, regardless of how expensive it is, because the O(…) time complexity only tells us how the time scales with large numbers of stream elements. Then, it inserts the element into a map, and the cost of that might depend on the number of elements already contained. Without a custom Map supplier, the map’s type is unspecified, so strictly speaking no statement can be made here.

In practice, it’s reasonable to assume that the result will be some sort of hash map with a net O(1) lookup complexity by default. So we have a net time complexity of O(n) for the grouping itself. Then, there is the downstream collector.
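To make the per-element work concrete, here is a minimal sketch of what the default grouping amounts to, assuming the result map is a HashMap and each group is an ArrayList. Both are implementation details rather than guarantees, and the class and method names (GroupingSketch, groupManually) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingSketch {

    // Hand-written equivalent of the default grouping, ASSUMING a HashMap
    // result and ArrayList groups (neither is guaranteed by the specification).
    static <T, K> Map<K, List<T>> groupManually(List<T> items,
                                                Function<? super T, ? extends K> classifier) {
        Map<K, List<T>> groups = new HashMap<>();
        for (T item : items) {                                    // n iterations
            K key = classifier.apply(item);                       // cost independent of n
            groups.computeIfAbsent(key, k -> new ArrayList<>())   // expected O(1) hash lookup
                  .add(item);                                     // amortized O(1) append
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");

        Map<Character, List<String>> manual = groupManually(words, w -> w.charAt(0));
        Map<Character, List<String>> viaStream =
                words.stream().collect(Collectors.groupingBy(w -> w.charAt(0)));

        // Compares the hand-written grouping with the collector's result.
        System.out.println(manual.equals(viaStream));
    }
}
```

Each loop iteration does a constant expected amount of work under these assumptions, which is where the O(n) total comes from.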

The default downstream collector is toList(), which produces an unspecified List type, so again, strictly speaking we can’t say anything about the cost of adding elements to it.

The current implementation produces an ArrayList, which has to perform copy operations whenever its capacity is exceeded, but since the capacity is raised by a constant factor each time, there is still a net complexity of O(n) for adding n elements. It’s reasonable to assume that future changes to the toList() implementation won’t make the costs worse than what we have today. So the time complexity of a default groupingBy collection is most likely O(n).
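The amortized argument can be checked with a small simulation. This sketch does not touch ArrayList itself. It only counts how many element copies resizing would cause, assuming a start capacity of 10 and growth by 50% per resize, which mirrors current OpenJDK behavior but is not part of the specification. The name GrowthSketch is hypothetical.

```java
class GrowthSketch {

    // Counts the element copies caused by resizing while appending n elements
    // to an array-backed list. ASSUMPTIONS: initial capacity 10, growth by 50%.
    static long copiesForAppends(int n) {
        int capacity = 10;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;              // a resize copies every existing element
                capacity += capacity >> 1;   // grow by 50%
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 100_000, 1_000_000}) {
            long copies = copiesForAppends(n);
            // The ratio copies / n stays bounded as n grows.
            System.out.println(n + " appends -> " + copies + " copies, ratio "
                    + (double) copies / n);
        }
    }
}
```

Because the capacities form a roughly geometric series, the total number of copies stays within a constant multiple of n, so appending n elements remains O(n) overall.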

If we use a custom Map supplier together with a custom downstream collector, the complexity depends on the ratio between the number of groups and the average number of elements per group. The worst case is the worse of the two, the map’s lookup and the downstream collector’s element processing, times the number of elements, since we could have one group containing all items or each item in its own group.
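As an illustration of such a combination, the following sketch uses the three-argument groupingBy with TreeMap as the map supplier and toCollection(TreeSet::new) as the downstream collector. The choice of TreeMap and TreeSet is only an example picked because both have logarithmic insertion cost, and the names CustomGroupingSketch and groupByLength are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CustomGroupingSketch {

    // TreeMap lookups cost O(log g) for g groups; TreeSet insertions cost
    // O(log m) for a group currently holding m elements.
    static TreeMap<Integer, TreeSet<String>> groupByLength(List<String> words) {
        return words.stream().collect(
                Collectors.groupingBy(
                        String::length,                            // classifier
                        TreeMap::new,                              // custom map supplier
                        Collectors.toCollection(TreeSet::new)));   // custom downstream
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "kiwi", "plum", "fig");
        System.out.println(groupByLength(words));
    }
}
```

With this combination, both extremes end up at O(n log n): if every element lands in one group, the TreeSet does n insertions of O(log n) each, and if every element forms its own group, the TreeMap does n lookups of O(log n) each.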

But usually, you are able to predict a bias for a particular grouping operation, so you would want to calculate the time complexity for that particular operation instead of relying on a statement about all grouping operations in general.

answered Oct 04 '22 20:10

Holger
