
How to decide on the ThreadPoolTaskExecutor pools and queue sizes?

This may be a more general question on how to decide on a thread pool size, but let's use the Spring ThreadPoolTaskExecutor for this case. I have the following configuration for the pool core and max size and the queue capacity. I've already read about what all these settings mean - there is a good answer here.

    @SpringBootApplication
    @EnableAsync
    public class MySpringBootApp {

        public static void main(String[] args) {
            ApplicationContext ctx = SpringApplication.run(MySpringBootApp.class, args);
        }

        @Bean
        public TaskExecutor taskExecutor() {
            ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
            executor.setCorePoolSize(5);
            executor.setMaxPoolSize(10);
            executor.setQueueCapacity(25);
            return executor;
        }
    }

The above numbers look random to me and I want to understand how to set them up correctly based on my environment. I will outline the following constraints that I have:

  1. the application will be running on a two-core CPU box
  2. the executor will work on a task which usually takes about 1-2 seconds to finish.
  3. usually I expect about 800 tasks/min to be submitted to my executor, spiking at 2500/min
  4. The task will construct some objects and make an HTTP call to Google pubsub.

Ideally I'd like to understand what other constraints I need to consider and based on them what will be a reasonable configuration for my pools and queue sizes.

Anton Belev asked May 09 '17



1 Answer

Update: This answer got a few votes over the years, so I'm adding a shortened version for people who don't have time to read my weird metaphor:

TL;DR answer:

The actual constraint is that a (logical) CPU core can only run a single thread at a time. Thus:

  • Core pool size: number of logical cores of your CPUs * 1/(ratio_of_time_your_thread_is_runnable_when_doing_your_task)

So, if you have 8 logical cores on your machine, you can safely put 8 threads in your thread pool (well, remember to exclude the other threads that may be in use). Then you need to ask yourself whether you can add more: benchmark the kind of task you intend to run on your thread pool. If you notice the threads are, on average, running only 50% of the time, that means your CPU is free to work on another thread 50% of the time, and you can add more threads.
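That sizing formula can be sketched in plain Java. The 0.1 runnable ratio below is a made-up example for a mostly-I/O-bound task; measure your own workload before trusting any number:

```java
// Sketch of the sizing formula above: scale the logical core count by
// how often a thread actually occupies a CPU while running one task.
public class PoolSizing {

    // runnableRatio = fraction of a task's wall time the thread is runnable
    // (1.0 = pure CPU work, 0.1 = waiting on I/O 90% of the time).
    public static int poolSize(int logicalCores, double runnableRatio) {
        return Math.max(logicalCores, (int) Math.round(logicalCores / runnableRatio));
    }

    public static void main(String[] args) {
        System.out.println(poolSize(2, 1.0)); // CPU-bound on 2 cores  -> 2
        System.out.println(poolSize(2, 0.1)); // I/O-bound on 2 cores  -> 20
    }
}
```

Treat the result as a starting point for a benchmark, not a final answer: the ratio changes with load, and very large pools have their own context-switching cost.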

  • Queue size: as many tasks as you can afford to keep waiting.

The queue size is the number of items your thread pool will accept before rejecting new ones. It is business logic. It depends on what behavior you expect: is there a point in accepting a billion tasks? When do you throw in the towel? If one task takes one second to complete, and you have 10 threads, that means the 10,000th task in the queue will hopefully be done in 1,000 seconds. Is that acceptable? The worst thing that can happen is clients timing out and re-submitting the same tasks before you could complete the first ones.
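That back-of-the-envelope latency check can be written down directly (plain arithmetic, no library assumptions):

```java
// Worst-case wait for the last task you are willing to queue, assuming
// `threads` workers drain the queue and every task takes `taskSeconds`.
public class QueueMath {

    public static double worstCaseWaitSeconds(int queuePosition, int threads, double taskSeconds) {
        return queuePosition * taskSeconds / threads;
    }

    public static void main(String[] args) {
        // The 10,000th queued task, 10 threads, 1 s per task -> 1000 s
        System.out.println(worstCaseWaitSeconds(10_000, 10, 1.0));
    }
}
```

Inverting it gives a queue capacity from a latency budget: capacity = budgetSeconds * threads / taskSeconds.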

Original ELI12 answer:

It may not be the most accurate answer, but I'll try:

A simple approach is to be aware that your 2-core CPU will only work on two threads at the same time.

If you have a relatively modern Intel CPU and Hyper-Threading (aka HT(TM), HTT(TM), SMT) is turned on (via a BIOS setting), your operating system will see twice as many available cores as there are physical cores in your CPU.

Either way, to detect from Java how many cores (or simultaneous, non-preempting threads) you can work with, just call int cores = Runtime.getRuntime().availableProcessors();
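As a minimal, runnable check:

```java
public class CoreCount {
    public static void main(String[] args) {
        // Logical cores visible to the JVM; with Hyper-Threading enabled
        // this is typically twice the number of physical cores.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("Logical cores: " + cores);
    }
}
```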

If you try to see your application as a Workshop (an actual one):

  • A processor would be represented by an employee. It is the physical unit that adds value to a product.
  • A task would be a lump of raw material (plus an instruction list).
  • A thread is a desk on which the employee can put a task and work on it.
  • The queue size is the length of the conveyor belt that brings the raw material to the desks.

Thus, your question becomes: "How do I choose how many desks to have, and how long my conveyor belt can be, inside my factory, given an unchanging number of employees?"

For the "how many desks" (threads) part:

An employee can only work at one desk at a time, and you can only have a single employee per desk. Thus, the basic setup would be to have at least as many desks as you have employees (to avoid leaving any employee (processor) without any possibility to work).

But, depending on your activity, you may be able to afford more desks per employee:

If your employees are expected to put mail inside envelopes constantly, an operation that requires their full attention (in programming: sorting collections, creating objects, incrementing counters), having more desks wouldn't help, and may even be detrimental, because your employee would sometimes have to change desks (context switching, which takes some time), leaving the task they were working on to make progress on another.

But if your task is making pottery, and relies on your employee waiting for the clay to bake in an oven (read: waiting on access to an external resource, such as a file system, a web service, etc.), your employee can afford to go model clay at another desk and come back to the first one later.

Thus, you can afford more desks per employee as long as your tasks have a big enough waiting/working (waiting/running) ratio, and the number of extra desks is how many tasks your employee can make progress on during the waiting time.

For the conveyor belt (queue) size part:

The queue size represents how many items you allow to be queued before starting to reject any more tasks (by throwing an exception); it is the threshold at which you start saying "OK, I'm already overbooked and will never be able to comply."

First, I'd say your conveyor belt needs to fit inside the workshop, meaning the collection should be small enough to prevent out-of-memory errors (obviously).

After that, it is based on your company policy. Let's assume a task is added to the belt every time a client makes an order (another service calls your API). If the caller doesn't care how much time you take to comply and trusts you with the execution, there's no point in limiting the size of the belt.

But if you can expect your client to get annoyed after waiting a month for their pottery - to leave you for a competitor, or to reorder another pottery assuming the first order was lost, without ever checking whether it was completed... then that first order was done for nothing, you won't get paid, and if your client places another order whenever you're too slow to comply, you'll enter a feedback loop, because every new order slows down the whole process.

Thus, in that case, you should put up a sign telling your clients: "Sorry, we're overbooked; you shouldn't place any new order now, as we won't be able to comply within an acceptable time range."

Then, the queue size would be: acceptable time range / time to complete a task.

Concrete example: if your client service expects the tasks it submits to be completed in less than 100 seconds, and knowing that every task takes 1-2 seconds, you should limit the queue to 50-100 tasks, because once 100 tasks are waiting in the queue, you're pretty sure the next one won't be completed in less than 100 seconds, so you reject the task to prevent the service from waiting for nothing.
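Applied to the numbers in the question (a 2-core box, 1-2 s I/O-bound tasks, spikes of 2500/min, i.e. roughly 42 tasks/s), a configuration might look like the sketch below. It uses a plain java.util.concurrent ThreadPoolExecutor (which is what Spring's ThreadPoolTaskExecutor delegates to), and every number is an assumption you should replace with your own measurements, including the 60-second latency budget made up for the example:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PubsubExecutorConfig {

    public static ThreadPoolExecutor pubsubExecutor() {
        int corePool = 20;   // 2 cores / ~0.1 runnable ratio -- measure, don't guess
        int maxPool  = 40;   // headroom for the 2500/min spikes
        int queueCap = 600;  // (60 s budget / 2 s per task) * 20 threads
        return new ThreadPoolExecutor(
                corePool, maxPool,
                60L, TimeUnit.SECONDS,                  // idle timeout above core size
                new LinkedBlockingQueue<>(queueCap),
                new ThreadPoolExecutor.AbortPolicy());  // reject loudly once overbooked
    }
}
```

One caveat that applies to the Spring executor too: extra threads beyond the core pool size are only created once the queue is full, so with a generous queue the max pool size rarely kicks in.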

Jeremy Grand answered Sep 19 '22