Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Arrays.sort and Arrays.parallelSort function behavior

Tags:

java

java-8

I have the following code ,

import java.util.Arrays;


public class ParellelStream {

    public static void main(String args[]){
        Double dbl[] = new Double[1000000];
        for(int i=0; i<dbl.length;i++){
            dbl[i]=Math.random();
        }

        long start = System.currentTimeMillis();
        Arrays.parallelSort(dbl);
        System.out.println("time taken :"+((System.currentTimeMillis())-start));

    }

}

When I run this code it takes time approx 700 to 800 ms, but when I replace the line Arrays.parallelSort to Arrays.sort it takes 500 to 600 ms. I read about the Arrays.parallelSort and Arrays.sort method which says that Arrays.parellelSort gives poor performance when dataset are small but here I am using array of 1000000 elements. what could be the reason for parallelSort poor performance ?? I am using java8.

like image 917
Rakesh Chouhan Avatar asked May 05 '15 13:05

Rakesh Chouhan


2 Answers

The parallelSort function will use a thread for each cpu core you have on your machine. Specifically parallelSort runs tasks on the ForkJoin common thread pool. If you only have one core you would not see an improvement over single threaded sort.

If you only have multiple cores you are going to have some upfront cost associated with creating the new threads which will mean that for relatively small arrays you are not going to see linear performance gains.

The compare function for comparing doubles is not an expensive function. I think that in this case 1000000 elements can be safely considered small and the benefits of using multiple threads is outweighed by the upfront costs of creating those threads. Since the upfront costs will be fixed you should see a performance gain with larger arrays.

like image 126
bhspencer Avatar answered Sep 20 '22 11:09

bhspencer


I read about the Arrays.parallelSort and Arrays.sort method which says that Arrays.parellelSort gives poor performance when dataset are small but here I am using array of 1000000 elements.

This is not the only thing to take in consideration. It depends a lot on your machine (how your CPU handle multi-threading etc).

Here a quote from the Parallelism part of The Java Tutorials

Note that parallelism is not automatically faster than performing operations serially, although it can be if you have enough data and processor cores [...] it is still your responsibility to determine if your application is suitable for parallelism.

You might also want to have a look at the code of java.util.ArraysParallelSortHelpers for a better understanding of the algorithm.

Note that the parallelSort method use the ForkJoinPool introduced in Java 7 to take advantages of each processors of your computer as stated in the javadoc :

A ForkJoinPool is constructed with a given target parallelism level; by default, equal to the number of available processors.

Note that if the length of the array is less then 1 << 13, the array will be sorted using the appropriate Arrays.sort method.

See also

  • Fork/Join
like image 20
Jean-François Savard Avatar answered Sep 19 '22 11:09

Jean-François Savard