Suppose we have a collection like this:
Set<Set<Integer>> set = Collections.newSetFromMap(new ConcurrentHashMap<>());
for (int i = 0; i < 10; i++) {
    Set<Integer> subSet = Collections.newSetFromMap(new ConcurrentHashMap<>());
    subSet.add(1 + (i * 5));
    subSet.add(2 + (i * 5));
    subSet.add(3 + (i * 5));
    subSet.add(4 + (i * 5));
    subSet.add(5 + (i * 5));
    set.add(subSet);
}
and we want to process it in one of these ways:
set.stream().forEach(subSet -> subSet.stream().forEach(System.out::println));
or
set.parallelStream().forEach(subSet -> subSet.stream().forEach(System.out::println));
or
set.stream().forEach(subSet -> subSet.parallelStream().forEach(System.out::println));
or
set.parallelStream().forEach(subSet -> subSet.parallelStream().forEach(System.out::println));
So, can someone please explain to me:
What is the difference between them?
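Think of them as two nested loops: the variants differ only in whether the outer loop, the inner loop, or both are run in parallel.
The fourth case is less clear-cut. By default all parallel streams share a single thread pool (the common ForkJoinPool), and if the pool is busy the current thread can be used, i.e. it might not be parallel^2 at all.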
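To see which threads actually do the work in the fourth case, a small sketch like the following can help. The class and method names (NestedParallelDemo, buildSets, threadsUsed) are made up for illustration; it only records the name of the thread that handles each element.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;

public class NestedParallelDemo {
    // Builds the same 10 x 5 nested set as in the question.
    static Set<Set<Integer>> buildSets() {
        Set<Set<Integer>> set = Collections.newSetFromMap(new ConcurrentHashMap<>());
        for (int i = 0; i < 10; i++) {
            Set<Integer> subSet = Collections.newSetFromMap(new ConcurrentHashMap<>());
            for (int j = 1; j <= 5; j++) {
                subSet.add(j + (i * 5));
            }
            set.add(subSet);
        }
        return set;
    }

    // Runs the fourth variant (parallel outer, parallel inner) and records
    // the name of every thread that processed at least one element.
    static Set<String> threadsUsed(Set<Set<Integer>> set) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        set.parallelStream().forEach(subSet ->
                subSet.parallelStream().forEach(n ->
                        names.add(Thread.currentThread().getName())));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println("Threads used: " + threadsUsed(buildSets()));
    }
}
```

The output depends on the machine and the run. Typically you would expect to see the calling thread alongside some ForkJoinPool.commonPool-worker threads, rather than a separate pool per nesting level.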
Which one is better, faster, and safer?
The first one; however, using flatMap would be simpler still.
set.stream().flatMap(s -> s.stream()).forEach(System.out::println);
The other versions are more complicated, and since the console, which is the bottleneck, is a shared resource, the multi-threaded versions are likely to be slower.
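As a rough, self-contained sketch of the flatMap approach, the snippet below flattens the nested set into one sorted list instead of printing. FlatMapDemo, buildSets and flatten are illustrative names, not part of the question.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class FlatMapDemo {
    // Builds the same 10 x 5 nested set as in the question.
    static Set<Set<Integer>> buildSets() {
        Set<Set<Integer>> set = Collections.newSetFromMap(new ConcurrentHashMap<>());
        for (int i = 0; i < 10; i++) {
            Set<Integer> subSet = Collections.newSetFromMap(new ConcurrentHashMap<>());
            for (int j = 1; j <= 5; j++) {
                subSet.add(j + (i * 5));
            }
            set.add(subSet);
        }
        return set;
    }

    // Flattens the nested sets into a single sequential stream,
    // sorts the elements, and collects them into a list.
    static List<Integer> flatten(Set<Set<Integer>> set) {
        return set.stream()
                .flatMap(Set::stream)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> all = flatten(buildSets());
        System.out.println(all.size()); // prints 50
    }
}
```

The sort is only there to make the result deterministic, since the iteration order of a ConcurrentHashMap-backed set is not something to rely on.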
Which one is good for huge collections?
Assuming your aim is to do something other than print, you want enough tasks to keep all your CPUs busy, but not so many that the tasks themselves create overhead. The second option might be worth considering.
Which one is good when we want to apply heavy processes to each item?
Again, the second example might be best, or possibly the third if you have only a small number of outer collections.
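The sketch below shows the second and third variants doing per-item work instead of printing. The expensive method is a made-up stand-in for a heavy computation, and all class and method names are hypothetical; which variant is actually faster would need to be measured on your own data and hardware.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class HeavyWorkDemo {
    // Builds the same 10 x 5 nested set as in the question.
    static Set<Set<Integer>> buildSets() {
        Set<Set<Integer>> set = Collections.newSetFromMap(new ConcurrentHashMap<>());
        for (int i = 0; i < 10; i++) {
            Set<Integer> subSet = Collections.newSetFromMap(new ConcurrentHashMap<>());
            for (int j = 1; j <= 5; j++) {
                subSet.add(j + (i * 5));
            }
            set.add(subSet);
        }
        return set;
    }

    // Hypothetical stand-in for an expensive, side-effect-free computation.
    static long expensive(int n) {
        long acc = n;
        for (int i = 0; i < 100_000; i++) {
            acc = (acc * 31 + i) % 1_000_003;
        }
        return acc;
    }

    // Baseline: everything sequential.
    static long sequentialTotal(Set<Set<Integer>> set) {
        return set.stream()
                .flatMap(Set::stream)
                .mapToLong(HeavyWorkDemo::expensive)
                .sum();
    }

    // Second variant: outer stream parallel, each sub-set processed sequentially.
    static long outerParallelTotal(Set<Set<Integer>> set) {
        return set.parallelStream()
                .mapToLong(sub -> sub.stream().mapToLong(HeavyWorkDemo::expensive).sum())
                .sum();
    }

    // Third variant: outer stream sequential, each sub-set processed in parallel.
    static long innerParallelTotal(Set<Set<Integer>> set) {
        return set.stream()
                .mapToLong(sub -> sub.parallelStream().mapToLong(HeavyWorkDemo::expensive).sum())
                .sum();
    }

    public static void main(String[] args) {
        Set<Set<Integer>> set = buildSets();
        System.out.println(sequentialTotal(set) == outerParallelTotal(set));
        System.out.println(sequentialTotal(set) == innerParallelTotal(set));
    }
}
```

Returning a value from each task and combining with sum() avoids shared mutable state, so all three variants give the same total regardless of how the work is split across threads.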