
Detailed difference between Java 8 ForkJoinPool and Executors.newWorkStealingPool?

What is the low-level difference between using:

ForkJoinPool pool = new ForkJoinPool(X);

and

ExecutorService ex = Executors.newWorkStealingPool(X);

where X is the desired level of parallelism, i.e. the number of threads running?

According to the docs, they look similar. Which one is more appropriate and safer for normal use? I have 130 million entries to write through a BufferedWriter and then sort by the first column using Unix sort.

Also let me know how many threads to keep if possible.

Note: My system has 8 cores and 32 GB RAM.

asked Dec 27 '16 00:12 by bit_cracker007


3 Answers

Work stealing is a technique used by modern thread-pools in order to decrease contention on the work queue.

A classical thread pool has one queue: each pool thread locks the queue, dequeues a task and then unlocks the queue. If the tasks are short and there are many of them, there is a lot of contention on the queue. Using a lock-free queue really helps here, but doesn't solve the problem entirely.

Modern thread pools use work stealing: each thread has its own queue. When a pool thread produces a task, it enqueues it to its own queue. When a pool thread wants to dequeue a task, it first tries its own queue, and if that is empty it "steals" work from the queues of other threads. This greatly decreases contention in the thread pool and improves performance.
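The dequeue rule described above can be sketched as a small single-threaded simulation. This is only an illustration, not how ForkJoinPool is implemented: the class and method names are hypothetical, real pools use concurrent deques, and the choice that the owner takes from one end while a thief takes from the other is an assumption modelled on ForkJoinPool's default (non-async) mode.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical, single-threaded model of work stealing (no real concurrency).
class WorkStealingSketch {

    // Next task for worker `self`: its own queue first, otherwise steal.
    static String nextTask(List<Deque<String>> queues, int self) {
        String own = queues.get(self).pollLast();      // owner takes its newest task
        if (own != null) {
            return own;
        }
        for (int i = 0; i < queues.size(); i++) {
            if (i == self) {
                continue;
            }
            String stolen = queues.get(i).pollFirst(); // thief takes the victim's oldest task
            if (stolen != null) {
                return stolen;
            }
        }
        return null;                                   // no work left anywhere
    }

    // Worker 0 holds three tasks, worker 1 holds none.
    static List<Deque<String>> demoQueues() {
        List<Deque<String>> queues = new ArrayList<>();
        queues.add(new ArrayDeque<>(Arrays.asList("a1", "a2", "a3")));
        queues.add(new ArrayDeque<>());
        return queues;
    }

    public static void main(String[] args) {
        List<Deque<String>> queues = demoQueues();
        System.out.println(nextTask(queues, 0)); // a3 (own queue)
        System.out.println(nextTask(queues, 1)); // a1 (stolen from worker 0)
    }
}
```

Because owner and thief work on opposite ends of a queue, they rarely touch the same element, which is where the reduced contention comes from.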

newWorkStealingPool creates a work-stealing thread pool whose number of threads defaults to the number of available processors.

newWorkStealingPool presents a new problem. If I have four logical cores, the pool will have four threads in total. If my tasks block, for example on synchronous I/O, my CPUs are underutilized. What I want is four active threads at any given moment: for example, four threads doing AES encryption and another 140 threads waiting for I/O to finish.

This is what ForkJoinPool provides: if a task spawns new tasks and then waits for them to finish (a join), the pool tries to add or resume worker threads in order to keep the CPUs saturated. It is worth mentioning that ForkJoinPool uses work stealing too.
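A task that spawns subtasks and waits for them might look like the following minimal sketch. The RangeSum class, its threshold and the parallelism of 4 are illustrative assumptions, not something from the question; only ForkJoinPool, RecursiveTask, fork() and join() are standard API.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums the integers in the half-open range [from, to).
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000; // arbitrary cutoff for splitting
    private final long from;
    private final long to;

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (long i = from; i < to; i++) {
                sum += i;
            }
            return sum;                      // small enough: compute directly
        }
        long mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid);
        RangeSum right = new RangeSum(mid, to);
        left.fork();                         // push the left half onto this worker's queue
        long rightSum = right.compute();     // work on the right half in this thread
        return left.join() + rightSum;       // wait for (or help run) the left half
    }

    static long sum(long from, long to, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new RangeSum(from, to));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(0, 1_000_000, 4)); // 499999500000
    }
}
```

The forked half sits in the worker's own queue, so an idle worker can steal it while the current thread computes the other half.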

Which one to use? If you work with the fork-join model, or you know your tasks block, use ForkJoinPool. If your tasks are short and mostly CPU-bound, use newWorkStealingPool.
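For tasks that block on something other than a join, the standard hook is ForkJoinPool.ManagedBlocker, which tells the pool that the current worker is about to block so it may arrange a spare thread. The sketch below is one way to wrap a blocking call; the BlockingCall class and its run helper are hypothetical names, and the Supplier merely stands in for real blocking I/O.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical wrapper that runs a blocking call through ForkJoinPool.managedBlock.
class BlockingCall<T> implements ForkJoinPool.ManagedBlocker {
    private final Supplier<T> call; // stands in for a blocking operation
    private T result;
    private boolean done;

    BlockingCall(Supplier<T> call) {
        this.call = call;
    }

    @Override
    public boolean block() {
        if (!done) {
            result = call.get();    // the potentially blocking work happens here
            done = true;
        }
        return true;                // true: no further blocking is necessary
    }

    @Override
    public boolean isReleasable() {
        return done;                // true once the result is available
    }

    static <T> T run(Supplier<T> call) {
        BlockingCall<T> blocker = new BlockingCall<>(call);
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blocking", e);
        }
        return blocker.result;
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(2);
        Callable<String> job = () -> run(() -> "simulated blocking read");
        System.out.println(pool.submit(job).join());
        pool.shutdown();
    }
}
```

managedBlock also works when called from a thread outside any ForkJoinPool; in that case it simply runs the blocker in place.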

All that said, modern applications tend to use a thread pool sized to the number of available processors and rely on asynchronous I/O and lock-free containers to avoid blocking. This (usually) gives the best performance.

answered Sep 23 '22 03:09 by David Haim


newWorkStealingPool is a higher level of abstraction for ForkJoinPool.

If you look at the Oracle JDK implementation, it's simply a preconfigured ForkJoinPool:

public static ExecutorService newWorkStealingPool() {
    return new ForkJoinPool(Runtime.getRuntime().availableProcessors(),
                            ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                            null, 
                            true);
}
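This can be checked at runtime. The following is a small sketch, assuming an OpenJDK-style implementation in which the factory returns a plain ForkJoinPool (an implementation detail, not something the ExecutorService return type promises). The PoolComparison class and describe helper are hypothetical names, and 4 is an arbitrary parallelism.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper that prints how the two pools from the question are configured.
class PoolComparison {

    static String describe(ExecutorService ex) {
        if (!(ex instanceof ForkJoinPool)) {
            return ex.getClass().getName(); // some other implementation
        }
        ForkJoinPool pool = (ForkJoinPool) ex;
        return "ForkJoinPool parallelism=" + pool.getParallelism()
                + " asyncMode=" + pool.getAsyncMode();
    }

    public static void main(String[] args) {
        ForkJoinPool direct = new ForkJoinPool(4);
        ExecutorService viaFactory = Executors.newWorkStealingPool(4);

        System.out.println(describe(direct));     // asyncMode=false (constructor default)
        System.out.println(describe(viaFactory)); // asyncMode=true (set by the factory)

        direct.shutdown();
        viaFactory.shutdown();
    }
}
```

The visible difference is asyncMode: with true, each worker processes its own locally queued tasks in FIFO order, which suits event-style tasks that are submitted but never joined; the constructor's default of false uses LIFO order, which suits recursive fork/join work.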

Unfortunately, looking at the implementation isn't a proper way to understand the purpose of a class.

Also credit to: https://dzone.com/articles/diving-into-java-8s-newworkstealingpools

answered Sep 22 '22 03:09 by Yamcha


It's only an abstraction over the Fork/Join framework:

/**
 * Creates a work-stealing thread pool using all
 * {@link Runtime#availableProcessors available processors}
 * as its target parallelism level.
 * @return the newly created thread pool
 * @see #newWorkStealingPool(int)
 * @since 1.8
 */
public static ExecutorService newWorkStealingPool() {
    return new ForkJoinPool(Runtime.getRuntime().availableProcessors(),
                            ForkJoinPool.defaultForkJoinWorkerThreadFactory,
                            null, true);
}

answered Sep 23 '22 03:09 by Rafael Bernabeu