I was reading about threading and learned about the fork/join API.
I found that you can either run tasks on the commonPool, the default pool that manages the threads, or submit the tasks to a newly created ForkJoinPool.
The difference between the two is as follows, to my understanding:
parallelism
(I'm ignoring the fully qualified system property key name.)
Based on the documentation, the commonPool is fine for most uses.
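To make the two options concrete, here is a minimal sketch that runs the same work on the common pool and on a newly created pool. The class name `PoolChoice`, the helper `sumOfSquares`, and the parallelism of 4 are made up for illustration. For reference, the fully qualified system property for the common pool is `java.util.concurrent.ForkJoinPool.common.parallelism`, whereas a new pool takes its parallelism as a constructor argument.

```java
import java.util.concurrent.ForkJoinPool;

class PoolChoice {
    // Hypothetical helper: runs the same computation on whichever pool is passed in.
    static int sumOfSquares(ForkJoinPool pool, int n) {
        return pool.submit(() -> {
            int total = 0;
            for (int i = 1; i <= n; i++) {
                total += i * i;
            }
            return total;
        }).join();
    }

    public static void main(String[] args) {
        // Option 1: the shared common pool; its parallelism comes from the
        // system property (or a default derived from the core count).
        ForkJoinPool common = ForkJoinPool.commonPool();

        // Option 2: a dedicated pool; 4 is an arbitrary value chosen for this sketch.
        ForkJoinPool dedicated = new ForkJoinPool(4);
        try {
            System.out.println("common parallelism:    " + common.getParallelism());
            System.out.println("dedicated parallelism: " + dedicated.getParallelism());
            System.out.println(sumOfSquares(common, 10));    // 385
            System.out.println(sumOfSquares(dedicated, 10)); // 385
        } finally {
            // A pool you create is yours to shut down; the common pool is not.
            dedicated.shutdown();
        }
    }
}
```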
This all boils down to my question:
When should I use the common pool? And why so? When should I create a new pool? And why so?
The answer, like most things in software engineering, is: "It depends".
If you look at this wonderful article:
According to Oracle’s documentation, using the predefined common pool reduces resource consumption, since this discourages the creation of a separate thread pool per task.
and
Using the fork/join framework can speed up processing of large tasks, but to achieve this outcome, some guidelines should be followed:
- Use as few thread pools as possible – in most cases, the best decision is to use one thread pool per application or system
- Use the default common thread pool, if no specific tuning is needed
- Use a reasonable threshold for splitting ForkJoinTask into subtasks
- Avoid any blocking in your ForkJoinTasks
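As a rough illustration of the guidelines above, the following sketch sums an array with a RecursiveTask on the common pool, splitting only while a chunk is larger than a threshold. The class name `ArraySum` and the threshold of 1,000 elements are assumptions made for this example, not values from the article; a sensible threshold depends on how expensive each element is to process.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ArraySum extends RecursiveTask<Long> {
    // Assumed threshold: chunks at or below this size are summed sequentially.
    static final int THRESHOLD = 1_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ArraySum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: plain loop, no blocking, no further splitting.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ArraySum left = new ArraySum(data, from, mid);
        ArraySum right = new ArraySum(data, mid, to);
        left.fork();                      // schedule the left half asynchronously
        long rightSum = right.compute();  // work on the right half in this thread
        return left.join() + rightSum;
    }

    // Runs the task on the default common pool, as the guidelines suggest.
    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ArraySum(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // 5000050000
    }
}
```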
However, there are also some arguments AGAINST following this approach:
Dedicated Pool for Complex Applications
Having a dedicated pool per logical working unit in a complex application is sometimes the preferred approach. Imagine an application that:
So your application has 3 logical work groups, each of which might have its own demands for parallelism. (Keep in mind that the common pool's default parallelism is one less than the number of available processors, which is fairly low on most machines.)
Better not to step on each other's toes, right? Note that this only scales up to a certain level, beyond which it's recommended to have a separate microservice for each of these work units, but if for one reason or another you are not there yet, then a dedicated ForkJoinPool per logical work unit is not a bad idea.
Other libraries
Even if your app's code has only one place where you want parallelism, you have no guarantee that some developer won't pull in a third-party dependency that also relies on the common ForkJoinPool, in which case you still have two places where this pool is in demand. That might be okay for your use case, and it might not be, especially if your default pool's parallelism is 4 or below.
Imagine a situation where your app's critical code (e.g. event handling or saving data to a database) has to compete for the common pool with some library that exports logs in parallel to some log sink.
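One way to convince yourself that a dedicated pool keeps your critical work off the shared workers is to ask, from inside a task, which pool is executing it. The sketch below does that with `ForkJoinTask.getPool()`. The class name `PoolIsolation`, the helper names, and the pool sizes are hypothetical, and this only shows where tasks run; it does not measure contention.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

class PoolIsolation {
    // Submits a probe to the pool and reports whether the probe executed
    // on one of that pool's own worker threads.
    static boolean runsOnOwnWorkers(ForkJoinPool pool) {
        Callable<Boolean> probe = () -> ForkJoinTask.getPool() == pool;
        return pool.submit(probe).join();
    }

    // Reports whether a task submitted to the pool ends up on the common pool.
    static boolean touchesCommonPool(ForkJoinPool pool) {
        Callable<Boolean> probe = () -> ForkJoinTask.getPool() == ForkJoinPool.commonPool();
        return pool.submit(probe).join();
    }

    public static void main(String[] args) {
        // Hypothetical split: critical work and log export get separate pools.
        ForkJoinPool critical = new ForkJoinPool(2);
        ForkJoinPool logExport = new ForkJoinPool(1);
        try {
            System.out.println(runsOnOwnWorkers(critical));   // true
            System.out.println(runsOnOwnWorkers(logExport));  // true
            System.out.println(touchesCommonPool(critical));  // false
        } finally {
            critical.shutdown();
            logExport.shutdown();
        }
    }
}
```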
Dedicated ForkJoinPool Makes Logging Neater
Additionally, the common ForkJoinPool names its threads rather non-descriptively, so if you are debugging or looking at logs, chances are you will have to sift through a ton of
ForkJoinPool.commonPool-worker-xx
In the situation described above, compare that with:
ForkJoinPool.grouping-worker-xx
ForkJoinPool.payload-handler-worker-xx
ForkJoinPool.cleanup-worker
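So there is some benefit in log cleanliness when using a dedicated ForkJoinPool per logical work group. Note that names like these require a custom thread factory; by default, the workers of a dedicated pool are named ForkJoinPool-N-worker-M.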
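Descriptive worker names like the ones above can be produced with a custom ForkJoinWorkerThreadFactory. Here is a minimal sketch of one way to do it; the class `NamedPools`, the helpers `namedPool` and `workerNameOf`, and the exact name format are assumptions for this example. It wraps the default factory and renames each worker with its own counter.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicInteger;

class NamedPools {
    // Builds a pool whose workers are named "ForkJoinPool.<name>-worker-<n>".
    static ForkJoinPool namedPool(String name, int parallelism) {
        AtomicInteger counter = new AtomicInteger();
        ForkJoinPool.ForkJoinWorkerThreadFactory factory = pool -> {
            // Let the default factory create the worker, then rename it.
            ForkJoinWorkerThread worker =
                    ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
            worker.setName("ForkJoinPool." + name + "-worker-" + counter.incrementAndGet());
            return worker;
        };
        // null handler and asyncMode=false mirror the defaults of the simpler constructors.
        return new ForkJoinPool(parallelism, factory, null, false);
    }

    // Returns the name of the worker thread that executed a probe task.
    static String workerNameOf(ForkJoinPool pool) {
        Callable<String> probe = () -> Thread.currentThread().getName();
        return pool.submit(probe).join();
    }

    public static void main(String[] args) {
        ForkJoinPool grouping = namedPool("grouping", 2);
        ForkJoinPool payloadHandler = namedPool("payload-handler", 2);
        try {
            System.out.println(workerNameOf(grouping));
            System.out.println(workerNameOf(payloadHandler));
        } finally {
            grouping.shutdown();
            payloadHandler.shutdown();
        }
    }
}
```

The counter is kept per pool rather than read from the worker itself, since the worker's pool index may not be assigned yet at the point where the factory runs.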
Using the common ForkJoinPool has a lower memory impact, uses fewer resources, creates fewer threads, and puts less pressure on garbage collection. However, this approach might be insufficient for some use cases, as pointed out above.
Using a dedicated ForkJoinPool per logical work unit in your application provides neater logging, and is worth considering when your parallelism level is low (i.e. not many cores) and when you want to avoid thread contention between logically different parts of your application. This, however, comes at the price of higher CPU utilization, higher memory overhead, and more thread creation.