I am trying to benchmark different ways of multithreading a simple Java application that transforms each element of a collection into another value.
Which of the approaches below (Java 8 parallel streams, or regular multithreading with lambdas) is the most efficient? Based on the output below, parallel streams seem to be as good as traditional multithreading. Am I right?
The output of the code below, in the order parallel stream, sequential stream, plain loop, thread pool (you have to replace alice.txt with another file), is:
153407 30420
time in ms - 4826
153407 30420
time in ms - 37908
153407 30420
time in ms - 37947
153407 30420
time in ms - 4839
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.stream.Stream;

    public class ParallelProcessingExample {
        public static void main(String[] args) throws IOException {
            String contents = new String(Files.readAllBytes(
                    Paths.get("impatient/code/ch2/alice.txt")), StandardCharsets.UTF_8);
            List<String> words = Arrays.asList(contents.split("[\\P{L}]+"));

            // 1. Parallel stream
            long t = System.currentTimeMillis();
            Stream<String> wordStream = words.parallelStream().map(x -> process(x));
            String[] out0 = wordStream.toArray(String[]::new);
            System.out.println(String.join("-", out0).length() + "\t" + out0.length);
            System.out.println("time in ms - " + (System.currentTimeMillis() - t));

            // 2. Sequential stream
            t = System.currentTimeMillis();
            wordStream = words.stream().map(x -> process(x));
            String[] out1 = wordStream.toArray(String[]::new);
            System.out.println(String.join("-", out1).length() + "\t" + out1.length);
            System.out.println("time in ms - " + (System.currentTimeMillis() - t));

            // 3. Plain for loop
            t = System.currentTimeMillis();
            String[] out2 = new String[words.size()];
            for (int j = 0; j < words.size(); j++) {
                out2[j] = process(words.get(j));
            }
            System.out.println(String.join("-", out2).length() + "\t" + out2.length);
            System.out.println("time in ms - " + (System.currentTimeMillis() - t));

            // 4. Manual multithreading: one task per available processor,
            //    each task handles a contiguous slice of the word list
            t = System.currentTimeMillis();
            int n = Runtime.getRuntime().availableProcessors();
            String[] out3 = new String[words.size()];
            try {
                ExecutorService pool = Executors.newCachedThreadPool();
                for (int i = 0; i < n; i++) {
                    int from = i * words.size() / n;
                    int to = (i + 1) * words.size() / n;
                    pool.submit(() -> {
                        for (int j = from; j < to; j++) {
                            out3[j] = process(words.get(j));
                        }
                    });
                }
                // Shut down only after every task has been submitted
                pool.shutdown();
                pool.awaitTermination(1, TimeUnit.HOURS);
            } catch (Exception e) {
                e.printStackTrace();
            }
            System.out.println(String.join("-", out3).length() + "\t" + out3.length);
            System.out.println("time in ms - " + (System.currentTimeMillis() - t));
        }

        // Simulates a small amount of work per element, then upper-cases it
        private static String process(String x) {
            try {
                TimeUnit.NANOSECONDS.sleep(1);
                //Thread.sleep(1); //1000 milliseconds is one second.
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            return x.toUpperCase();
        }
    }
Java 8 parallel streams can, in general, be as good as manual multithreading, but that depends on the concrete situation.
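One part of the "concrete situation" is how many threads each approach actually uses. By default a parallel stream runs on the common ForkJoinPool, while the manual version in the question starts one task per available processor. The following is only a minimal sketch for comparing the two numbers on your own machine; the class name PoolSizeCheck and its helper methods are hypothetical and not part of the question's code.

```java
import java.util.concurrent.ForkJoinPool;

class PoolSizeCheck {

    // Number of processors the JVM reports; the question uses this as its task count.
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool that parallel streams use by default.
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors:    " + availableCores());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
    }
}
```

If the two numbers are close, similar timings for the parallel stream and the hand-written thread pool are to be expected for a workload like process(), where every element costs roughly the same.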
You get a RejectedExecutionException when the pool is shut down too early: pool.shutdown() must be called outside the for loop, after all tasks have been submitted.
One big advantage of Java 8 parallel streams is that you don't have to worry about such things.
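To make the difference concrete, here is a small sketch that puts the two variants side by side. It assumes a fixed-size pool and a plain toUpperCase() transformation instead of the question's process() method; the class name ShutdownAfterSubmit and both helper methods are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownAfterSubmit {

    // Manual variant: split the list into nThreads slices and submit one task per slice.
    static String[] withExecutor(List<String> words, int nThreads) throws InterruptedException {
        String[] out = new String[words.size()];
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (int i = 0; i < nThreads; i++) {
            int from = i * words.size() / nThreads;
            int to = (i + 1) * words.size() / nThreads;
            pool.submit(() -> {
                for (int j = from; j < to; j++) {
                    out[j] = words.get(j).toUpperCase();
                }
            });
            // Calling pool.shutdown() here, inside the loop, would make the
            // next submit() throw RejectedExecutionException.
        }
        // Correct place: after the loop, once every task has been submitted.
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return out;
    }

    // Stream variant: no pool to create, partition, or shut down.
    static String[] withParallelStream(List<String> words) {
        return words.parallelStream().map(String::toUpperCase).toArray(String[]::new);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> words = Arrays.asList("alice", "was", "beginning", "to", "get", "very", "tired");
        System.out.println(String.join("-", withExecutor(words, 4)));
        System.out.println(String.join("-", withParallelStream(words)));
    }
}
```

Both methods return the words in their original order. The stream version gets there in one line, while the executor version has to handle slicing, shutdown order, and waiting for termination itself.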