So I was just learning new Java 8, specially lambdas and date and time api. I was comparing it with scala. My basic idea was to find the execution time difference between imperative, Stream and parallel stream. So I decided to create a Library application and do some operations like searching, filtering, sorting etc. I created a Library class with a list field called books and populated it with 1000 books. Then created a functional interface for search and did some operation in all three styles. It all worked fine. My code is:
// Functional Interface interface Search<T> { public void search(T t); } // Library class final Library library = new Library(); // This just creates some random book objects. final List<Book> books = collectBooks(); final Search<List<Book>> parallelSearch = (bks) -> library.findAndPrintBooksParallel(bks); // Parallel Operations private void findAndPrintBooksParallel(List<Book> books) { books.parallelStream() .filter(b -> b.getAuthor().equals("J.K. Rowling")) .sorted((x,y) -> x.getAuthor().compareTo(y.getAuthor())) .map(Book::getIsbn) .forEach(Library::waitAndPrintRecord); }
Now I tried to re-create the same program in scala and see whether the execution is faster or not? Surprisingly scala didn't allow me to do parallel sorting (Or may be I'm being ignorant here). My scala library is
// Again some random book objects as a list val books = collectBooks // Parallel operation books.par filter(_.author == "J.K. Rowling") map (_.isdn) foreach waitAndPrint
Here the books.par
gives a ParSeq. This doesn't have a sort method. Is there any way I could create a parallel sort with my books list in scala. So I could write something like:
books.par filter(_.author == "J.K. Rowling") sortWith (_.author < _.author) map (_.isdn) foreach waitAndPrint
Your help is much appreciated. Thanks.
Java 8 introduced a new method called as parallelSort() in java.util.Arrays Class. It uses Parallel Sorting of array elements. Algorithm of parallelSort() 1. The array is divided into sub-arrays and that sub-arrays is again divided into their sub-arrays, until the minimum level of detail in a set of array.
The parallelSort() is functionally different. Unlike sort(), which sorts data sequentially using a single thread, it uses a parallel sort-merge sorting algorithm. It breaks the array into sub-arrays that are themselves sorted and then merged. For executing parallel tasks it uses the ForkJoin pool.
Merge Sort uses recursion to solve the problem of sorting more efficiently than algorithms previously presented, and in particular it uses a divide and conquer approach.
One can implement a parallel sort in scala e.g. http://blog.yunglinho.com/blog/2013/03/19/parallel-external-merge-sort/ . I don't know why ParSeq doesn't offer sorting in its API. Note that there are a number of alternatives to .par
, so don't necessarily assume that your results from ParSeq are representative of Scala parallelism in general. (But benchmarks are valuable and I wish you well with yours).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With