I have the following function for the unification of multiple collections (includes repeated elements):
public static <T> List<T> unify(Collection<T>... collections) {
return Arrays.stream(collections)
.flatMap(Collection::stream)
.collect(Collectors.toList());
}
It would be nice to have a function with a similar signature for the intersection of collections (using type equality). For example:
public static <T> List<T> intersect(Collection<T>... collections) {
//Here is where the magic happens
}
I found an implementation of the intersect function, but it doesnt use streams:
public static <T> Set<T> intersect(Collection<? extends Collection<T>> collections) {
Set<T> common = new LinkedHashSet<T>();
if (!collections.isEmpty()) {
Iterator<? extends Collection<T>> iterator = collections.iterator();
common.addAll(iterator.next());
while (iterator.hasNext()) {
common.retainAll(iterator.next());
}
}
return common;
}
Is there any way to implement something similar to the unify function making use of streams? Im not so experienced in java8/stream api, because of that some advice would be really helpful.
The Stream interface in Java API provides useful methods that make it easier to process collections. Let's take a look at two of its methods – concat () and flatMap () – that are used for combining collections. Once you obtain a Stream, you can perform aggregate operations on it. 2.1. Using the concat () Method
In this article Lambda Expression with Collections are discussed with examples of sorting different collections like ArrayList, TreeSet, TreeMap, etc. We can use Comparator interface to sort, It only contain one abstract method: – compare (). An interface that only contain only a single abstract method then it is called as Functional Interface.
Using Java 8 Stream API The Stream interface in Java API provides useful methods that make it easier to process collections. Let's take a look at two of its methods – concat () and flatMap () – that are used for combining collections. Once you obtain a Stream, you can perform aggregate operations on it. 2.1. Using the concat () Method
The Stream interface in Java API provides useful methods that make it easier to process collections. Let's take a look at two of its methods – concat () and flatMap () – that are used for combining collections. Once you obtain a Stream, you can perform aggregate operations on it. 2.1.
You can write your own collector in some utility class and use it:
public static <T, S extends Collection<T>> Collector<S, ?, Set<T>> intersecting() {
class Acc {
Set<T> result;
void accept(S s) {
if(result == null) result = new HashSet<>(s);
else result.retainAll(s);
}
Acc combine(Acc other) {
if(result == null) return other;
if(other.result != null) result.retainAll(other.result);
return this;
}
}
return Collector.of(Acc::new, Acc::accept, Acc::combine,
acc -> acc.result == null ? Collections.emptySet() : acc.result,
Collector.Characteristics.UNORDERED);
}
The usage would be pretty simple:
Set<T> result = Arrays.stream(collections).collect(MyCollectors.intersecting());
Note however that collector cannot short-circuit: even if intermediate result will be an empty collection, it will still process the rest of the stream.
Such collector is readily available in my free StreamEx library (see MoreCollectors.intersecting()
). It works with normal streams like above, but if you use it with StreamEx (which extends normal stream) it becomes short-circuiting: the processing may actually stop early.
While it’s tempting to think of retainAll
as a black-box bulk operation that must be the most efficient way to implement an intersection operation, it just implies iterating over the entire collection and testing for each element whether it is contained in the collection passed as argument. The fact that you are calling it on a Set
does not imply any advantage, as it is the other collection, whose contains
method will determine the overall performance.
This implies that linearly scanning a set and testing each element for containment within all other collections will be on par with performing retainAll
for each collection. Bonus points for iterating over the smallest collection in the first place:
public static <T> Set<T> intersect(Collection<? extends Collection<T>> collections) {
if(collections.isEmpty()) return Collections.emptySet();
Collection<T> smallest
= Collections.min(collections, Comparator.comparingInt(Collection::size));
return smallest.stream().distinct()
.filter(t -> collections.stream().allMatch(c -> c==smallest || c.contains(t)))
.collect(Collectors.toSet());
}
or, alternatively
public static <T> Set<T> intersect(Collection<? extends Collection<T>> collections) {
if(collections.isEmpty()) return Collections.emptySet();
Collection<T> smallest
= Collections.min(collections, Comparator.comparingInt(Collection::size));
HashSet<T> result=new HashSet<>(smallest);
result.removeIf(t -> collections.stream().anyMatch(c -> c!=smallest&& !c.contains(t)));
return result;
}
I think maybe it would make more sense to use Set instead of List (maybe that was a typo in your question):
public static <T> Set<T> intersect(Collection<T>... collections) {
//Here is where the magic happens
return (Set<T>) Arrays.stream(collections).reduce(
(a,b) -> {
Set<T> c = new HashSet<>(a);
c.retainAll(b);
return c;
}).orElseGet(HashSet::new);
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With