I would like to learn the time complexity of the statement below (in Java 8).
list.stream().collect(groupingBy(...));
Any idea?
In Java 8, you obtain a stream from the list and use a Collector to group its elements in a single line of code. It is as simple as passing the classification function to the collector, and the grouping is done. By changing the classification function, you can produce different groupings.
The groupingBy() method of the Collectors class in Java is used to group objects by some property and store the results in a Map instance. To use it, we always need to specify the property by which the grouping is performed. This method provides functionality similar to SQL's GROUP BY clause.
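For illustration, here is a minimal, self-contained sketch of that idea; the sample data and the group-by-length classifier are only assumptions chosen for the example:

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingByExample {
    public static void main(String[] args) {
        // Illustrative data: group words by their length.
        List<String> words = Arrays.asList("apple", "banana", "cherry", "fig", "kiwi");

        // The classification function (String::length) plays the role of
        // SQL's GROUP BY column; the result maps each key to a list of elements.
        Map<Integer, List<String>> byLength =
                words.stream().collect(Collectors.groupingBy(String::length));

        System.out.println(byLength); // e.g. {3=[fig], 4=[kiwi], 5=[apple], 6=[banana, cherry]}
    }
}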
In Java 8 Streams, the flatMap() method applies a mapper function to each element, and that function returns a stream of values for the element. In other words, the mapper produces a separate new stream for every element, and flatMap() then flattens all of those streams into a single resulting stream.
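A minimal sketch of that behavior, using a hypothetical list of lists as input and List::stream as the mapper:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapExample {
    public static void main(String[] args) {
        // Illustrative nested input: a list of lists.
        List<List<Integer>> nested = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3, 4), Arrays.asList(5));

        // The mapper (List::stream) returns a stream for each element;
        // flatMap then flattens those per-element streams into one stream.
        List<Integer> flat = nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());

        System.out.println(flat); // [1, 2, 3, 4, 5]
    }
}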
Some of the important Java 8 features are the forEach() method in the Iterable interface, default and static methods in interfaces, functional interfaces, and lambda expressions.
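A small example touching a few of these features; the data and the Predicate are illustrative choices, not anything required by the question:

import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class Java8FeaturesExample {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Bob", "Cleo");

        // forEach() on an Iterable, with a lambda expression.
        names.forEach(name -> System.out.println(name));

        // Predicate is a functional interface; negate() is one of its default methods.
        Predicate<String> isShort = s -> s.length() <= 3;
        System.out.println(isShort.test("Ann"));          // true
        System.out.println(isShort.negate().test("Ann")); // false
    }
}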
There is no general answer to that question, as the time complexity depends on all of the operations involved. Since the stream has to be processed entirely, there is a base time complexity of O(n) that has to be multiplied by the cost of all operations done per element. This assumes that the cost of the iteration itself is not worse than O(n), which is the case for most stream sources.
So, assuming no intermediate operations that affect the time complexity, groupingBy has to evaluate the classification function for each element. That evaluation should be independent of the other elements, so it does not affect the time complexity (regardless of how expensive it is, as the O(…) time complexity only tells us how the time scales with large numbers of stream elements). Then, it will insert the element into a map, which might depend on the number of already contained elements. Without a custom Map supplier, the map's type is unspecified, hence no statement can be made here. In practice, it's reasonable to assume that the result will be some sort of hash map with a net O(1) lookup complexity by default. So we have a net time complexity of O(n) for the grouping. Then, we have the downstream collector.
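To make the per-element work concrete, here is a simplified, hand-rolled grouping loop; this is only a sketch of the reasoning, not the actual groupingBy implementation, and the helper name groupByHand is hypothetical:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class GroupingCostSketch {
    // Simplified hand-rolled grouping: one classifier call plus one
    // expected-O(1) hash-map insertion and amortized-O(1) list append
    // per element, so the whole loop is O(n) for n elements.
    static <T, K> Map<K, List<T>> groupByHand(List<T> input, Function<T, K> classifier) {
        Map<K, List<T>> result = new HashMap<>();
        for (T element : input) {
            K key = classifier.apply(element);                  // independent of other elements
            result.computeIfAbsent(key, k -> new ArrayList<>()) // expected O(1) lookup/insert
                  .add(element);                                // amortized O(1) append
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "banana", "fig", "kiwi");
        System.out.println(groupByHand(words, String::length));
    }
}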
The default downstream collector is toList(), which produces an unspecified List type, so again, we can't say anything about the cost of adding elements to it. The current implementation produces an ArrayList, which has to perform copy operations when its capacity is exceeded, but since the capacity is raised by a factor each time, there is still a net complexity of O(n) for adding n elements. It's reasonable to assume that future changes to the toList() implementation won't make the cost worse than what we have today. So the time complexity of a default groupingBy collection operation is likely O(n).
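For reference, the one-argument groupingBy(f) is documented to behave like groupingBy(f, toList()); a minimal sketch with the downstream collector written out explicitly (the sample data is illustrative):

import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

public class DefaultDownstreamExample {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "banana", "fig", "kiwi");

        // groupingBy(f) behaves like groupingBy(f, toList()). The List type
        // returned by toList() is unspecified; the current implementation
        // happens to use an ArrayList, whose capacity grows by a factor,
        // keeping n appends at O(n) amortized.
        Map<Integer, List<String>> byLength =
                words.stream().collect(groupingBy(String::length, toList()));

        System.out.println(byLength);
    }
}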
If we use a custom Map supplier with a custom downstream collector, the complexity depends on the ratio of the number of groups to the number of elements per group. The worst case would be the worse of the two, the map's lookup and the downstream collector's element processing (times the number of elements), as we could have one group containing all items, or each item could be in its own group.
But usually, you can predict a bias for a particular grouping operation, so you would want to calculate the time complexity of that particular operation instead of relying on a statement about all grouping operations in general.
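As one example of such a per-operation calculation (the TreeMap supplier and counting() downstream are only assumptions for the sketch), the three-argument groupingBy lets you pick both the map and the downstream collector; with a TreeMap the per-element lookup costs O(log g) for g groups, and counting() does O(1) work per element, so this particular grouping is O(n log g):

import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

public class CustomGroupingExample {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "banana", "fig", "kiwi", "cherry");

        // Three-argument form: classifier, map supplier, downstream collector.
        // A TreeMap lookup costs O(log g) for g groups, and counting() does
        // O(1) work per element, so this grouping is O(n log g) overall.
        TreeMap<Integer, Long> countsByLength = words.stream()
                .collect(groupingBy(String::length, TreeMap::new, counting()));

        System.out.println(countsByLength); // sorted by key: {3=1, 4=1, 5=1, 6=2}
    }
}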