I encountered a strange problem in production today. Although I love Guava, I ran into a use case where Guava's Sets.intersection() was performing badly. I wrote some sample code:
Set<Long> cache = new HashSet<>();
for (long i = 0; i < 1000000; i++) {
    cache.add(i);
}
Set<Long> keys = new HashSet<>();
for (long i = 0; i < 100; i++) {
    keys.add(i);
}

long start = System.currentTimeMillis();
Set<Long> foundKeys = new HashSet<>();
for (Long key : keys) {
    if (cache.contains(key)) {
        foundKeys.add(key);
    }
}
System.out.println("Java search: " + (System.currentTimeMillis() - start));

start = System.currentTimeMillis();
SetView<Long> intersection = Sets.intersection(keys, cache);
System.out.println("Guava search: " + (System.currentTimeMillis() - start));
I tried to recreate a production scenario in which I have a key cache and I look up all the keys that are present in the cache. Strangely, the Guava search takes far longer than the plain Java search. After running this I got:

Java search: 0
Guava search: 36

Can anyone tell me why this isn't suited to my use case, or is there a bug in Guava?
It turns out the problem was multiple calls to SetView.size(). Since SetView is a (live) view of the intersection of the two sets, the size of the intersection has to be recalculated every time:
public static <E> SetView<E> intersection(final Set<E> set1, final Set<?> set2) {
    // ...
    return new SetView<E>() {
        @Override public Iterator<E> iterator() {
            return Iterators.filter(set1.iterator(), inSet2);
        }

        @Override public int size() {
            return Iterators.size(iterator());
        }
        // ...
    };
}
As can be seen here, "recalculating" in this case means iterating across the entire view, which can be quite time-consuming.
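To see the cost concretely, here is a minimal plain-Java sketch (no Guava, names are my own) that mimics what the view's size() does: every call re-walks the iterator and re-checks membership, so three size() calls mean three full passes over the set.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Sketch of a live intersection view's size(): each call filters set1's
// iterator by membership in set2, exactly one full pass per call.
public class LazyViewDemo {
    static long containsChecks = 0; // counts how much work size() does

    static int lazySize(Set<Long> set1, Set<Long> set2) {
        int n = 0;
        Iterator<Long> it = set1.iterator();
        while (it.hasNext()) {
            containsChecks++;
            if (set2.contains(it.next())) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Set<Long> small = new HashSet<>();
        Set<Long> big = new HashSet<>();
        for (long i = 0; i < 100; i++) small.add(i);
        for (long i = 0; i < 1000000; i++) big.add(i);

        // Three "innocent" size() calls -> three full passes over `small`.
        lazySize(small, big);
        lazySize(small, big);
        lazySize(small, big);
        System.out.println(containsChecks); // 300 membership checks, not 100
    }
}
```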
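A sketch of the second option, again in plain Java so it runs without Guava (with Guava you would materialize the view via ImmutableSet.copyOf(Sets.intersection(keys, cache)) instead of the hand-written copy below):

```java
import java.util.HashSet;
import java.util.Set;

public class SnapshotDemo {
    // Materialize the intersection once into a plain set; afterwards size()
    // is a stored field, not a re-iteration of the view.
    static Set<Long> materializedIntersection(Set<Long> keys, Set<Long> cache) {
        Set<Long> result = new HashSet<>();
        for (Long k : keys) {
            if (cache.contains(k)) {
                result.add(k);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Long> cache = new HashSet<>();
        Set<Long> keys = new HashSet<>();
        for (long i = 0; i < 1000000; i++) cache.add(i);
        for (long i = 0; i < 100; i++) keys.add(i);

        Set<Long> snapshot = materializedIntersection(keys, cache);
        // Every subsequent size() call is now O(1), however often it runs.
        System.out.println(snapshot.size()); // 100
    }
}
```

Note the snapshot no longer tracks later changes to the underlying sets, which is exactly the trade-off: a live view versus a one-time copy.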