Why is an ExecutorService created via newCachedThreadPool evil?

Paul Tyma's presentation has this line:

Executors.newCachedThreadPool evil, die die die

Why is it evil?

I will hazard a guess: is it because the number of threads grows in an unbounded fashion? Thus a server that has been slashdotted would probably die if the JVM's max thread count were reached?
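To make the guess concrete, here is a minimal sketch (the task count is arbitrary) of the growth in question: while every existing thread is busy, each new submission gets a brand-new thread.

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ThreadPoolExecutor;

    public class CachedPoolGrowth {
        public static void main(String[] args) throws InterruptedException {
            // newCachedThreadPool() is a ThreadPoolExecutor under the hood
            ThreadPoolExecutor pool =
                    (ThreadPoolExecutor) Executors.newCachedThreadPool();
            CountDownLatch gate = new CountDownLatch(1);

            // 10,000 tasks that all block: no thread ever becomes idle, so
            // the pool creates a brand-new thread for every submission.
            for (int i = 0; i < 10_000; i++) {
                pool.submit(() -> {
                    try { gate.await(); } catch (InterruptedException ignored) {}
                });
            }
            // On small machines this may already die with
            // "OutOfMemoryError: unable to create new native thread".
            System.out.println("threads: " + pool.getPoolSize()); // ~10,000
            gate.countDown();
            pool.shutdown();
        }
    }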

asked May 16 '11 by Frankie Ribery

3 Answers

(This is Paul)

The intent of the slide (apart from the facetious wording) was, as you mention, that this thread pool grows without bound, creating new threads.

A thread pool inherently represents a queue and transfer point of work within a system. That is, something is feeding it work to do (and it may be feeding work elsewhere too). If a thread pool starts to grow, it's because it cannot keep up with demand.

In general, that's fine, as computer resources are finite and that queue is built to handle bursts of work. However, this thread pool gives you no way to push the bottleneck forward.

For example, in a server scenario, a few threads might be accepting on sockets and handing the clients to a thread pool for processing. If that thread pool starts to grow out of control, the system should stop accepting new clients (in fact, the "acceptor" threads often hop into the thread pool temporarily to help process clients).
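A hedged sketch of that shape (the port, pool sizes, and handle() helper are illustrative, not from the talk): a bounded pool whose rejection policy runs overflow work on the acceptor thread itself, so accepting naturally slows down when the pool is saturated.

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BackpressureServer {
        public static void main(String[] args) throws IOException {
            ExecutorService workers = new ThreadPoolExecutor(
                    8, 32,                          // small core, bounded max
                    60L, TimeUnit.SECONDS,          // reclaim idle burst threads
                    new ArrayBlockingQueue<>(100),  // bounded work queue
                    new ThreadPoolExecutor.CallerRunsPolicy());

            try (ServerSocket server = new ServerSocket(8080)) {
                while (true) {
                    Socket client = server.accept();
                    // When both pool and queue are full, CallerRunsPolicy runs
                    // the task on this acceptor thread; accept() then pauses
                    // until the system catches up -- backpressure instead of
                    // unbounded thread growth.
                    workers.execute(() -> handle(client));
                }
            }
        }

        private static void handle(Socket client) {
            try (Socket c = client) {
                // hypothetical per-client processing goes here
            } catch (IOException ignored) {
            }
        }
    }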

The effect is similar if you use a fixed thread pool with an unbounded input queue. Any time you consider the scenario of the queue filling out of control, you realize the problem.
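For reference, the Executors javadoc documents this equivalence: a fixed pool is just a ThreadPoolExecutor in front of an unbounded LinkedBlockingQueue.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class FixedPoolEquivalence {
        public static void main(String[] args) {
            ExecutorService fixed = Executors.newFixedThreadPool(8);
            // Per the Executors javadoc, the above is equivalent to:
            ExecutorService same = new ThreadPoolExecutor(
                    8, 8,                                 // threads capped at 8...
                    0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<Runnable>()); // ...queue is unbounded
            fixed.shutdown();
            same.shutdown();
        }
    }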

IIRC, Matt Welsh's seminal SEDA servers (which are asynchronous) created thread pools which modified their size according to server characteristics.

The idea of stopping accepting new clients sounds bad until you realize the alternative is a crippled system that is processing no clients. (Again, with the understanding that computers are finite: even an optimally tuned system has a limit.)

Incidentally, JVMs typically limit you to 16k or 32k threads, depending on the JVM. But if you are CPU bound, that limit isn't very relevant: starting yet another thread on a CPU-bound system is counterproductive.

I've happily run systems at 4 or 5 thousand threads. But nearing the 16k limit, things tend to bog down even when not CPU bound (the limit is JVM enforced; we ran many more threads in C++ on Linux).

answered by Paul Tyma


The problem with Executors.newCachedThreadPool() is that the executor will create and start as many threads as necessary to execute the tasks submitted to it. While this is mitigated by the fact that idle threads are eventually terminated (the timeout is configurable), it can indeed lead to severe resource starvation, or even crash the JVM (or some badly designed OS).
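Concretely, the javadoc documents Executors.newCachedThreadPool() as equivalent to the following configuration, which is where both the unbounded growth and the configurable idle timeout come from:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.SynchronousQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class CachedPoolEquivalence {
        public static void main(String[] args) {
            ExecutorService cached = new ThreadPoolExecutor(
                    0,                        // no threads kept when fully idle
                    Integer.MAX_VALUE,        // effectively unbounded maximum
                    60L, TimeUnit.SECONDS,    // idle threads reclaimed after 60s
                    new SynchronousQueue<Runnable>()); // zero-capacity handoff
            cached.shutdown();
        }
    }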

answered by Laurent Pireyn


There are a couple of issues with it. Unbounded growth in the number of threads is an obvious one: if you have CPU-bound tasks, then allowing many more threads than the available CPUs to run them simply creates scheduler overhead, with your threads context switching all over the place and none actually progressing much.

If your tasks are IO-bound, though, things get more subtle. Knowing how to size pools of threads that are waiting on network or file IO is much more difficult, and depends a lot on the latencies of those IO events. Higher latencies mean you need (and can support) more threads.
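A common rule of thumb for that sizing comes from Java Concurrency in Practice; the utilization and latency figures below are illustrative assumptions, not measurements.

    public class PoolSizing {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            double utilization = 0.8; // assumed: leave some scheduling headroom
            double waitMs = 50.0;     // assumed: time a task blocks on IO
            double computeMs = 5.0;   // assumed: time a task spends on CPU

            // threads = cores * utilization * (1 + wait/compute)
            int size = (int) (cores * utilization * (1 + waitMs / computeMs));
            System.out.println("suggested pool size: " + size);
            // CPU-bound tasks (waitMs ~ 0) give roughly one thread per core;
            // high IO latency justifies (and can support) many more threads.
        }
    }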

The cached thread pool continues adding new threads as the rate of task production outstrips the rate of execution. There are a couple of small barriers to this (such as locks that serialise new thread id creation), but this unbounded growth can lead to out-of-memory errors.

The other big problem with the cached thread pool is that it can be slow for the task-producing thread. The pool is configured with a SynchronousQueue for tasks to be offered to. This queue implementation basically has zero size and only works when there is a matching consumer for a producer (there is a thread polling when another is offering). The actual implementation was significantly improved in Java 6, but it is still comparatively slow for the producer, particularly when the handoff fails (as the producer is then responsible for creating a new thread to add to the pool). Often it is better for the producer thread to simply drop the task on an actual queue and continue.
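A small sketch of that handoff behaviour (mine, not the pool internals verbatim): offer() on a SynchronousQueue succeeds only if a consumer is already waiting, which is essentially the check the executor makes before paying to create a new thread.

    import java.util.concurrent.SynchronousQueue;

    public class HandoffDemo {
        public static void main(String[] args) throws InterruptedException {
            SynchronousQueue<String> queue = new SynchronousQueue<>();

            // No consumer is waiting: the zero-capacity queue rejects the offer.
            System.out.println(queue.offer("task")); // false

            // With a consumer blocked in take(), the same offer succeeds.
            Thread consumer = new Thread(() -> {
                try {
                    System.out.println("got: " + queue.take());
                } catch (InterruptedException ignored) {
                }
            });
            consumer.start();
            Thread.sleep(100);                       // let the consumer block
            System.out.println(queue.offer("task")); // true: direct handoff
            consumer.join();
        }
    }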

The problem is, nobody has a pool with a small core set of threads that, when they are all busy, creates new threads up to some maximum and only then enqueues subsequent tasks. Fixed thread pools seem to promise this, but they only start adding more threads when the underlying queue rejects more tasks (i.e., it is full). A LinkedBlockingQueue never gets full, so those pools never grow beyond the core size. An ArrayBlockingQueue has a capacity, but as it only grows the pool when that capacity is reached, this doesn't mitigate the production rate until it is already a big problem. Currently the solution requires using a good rejected execution policy such as caller-runs, but it needs some care.
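A sketch of that trap (the sizes are arbitrary): with an unbounded queue the offer always succeeds, so the executor never even consults its maximum pool size.

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class CoreSizeTrap {
        public static void main(String[] args) {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    2, 100,                       // a max of 100 looks generous...
                    60L, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<>()); // ...but it is never consulted

            CountDownLatch gate = new CountDownLatch(1);
            for (int i = 0; i < 1_000; i++) {
                pool.execute(() -> {
                    try { gate.await(); } catch (InterruptedException ignored) {}
                });
            }
            // Threads beyond the core are only added when the queue rejects a
            // task, and an unbounded LinkedBlockingQueue never rejects anything.
            System.out.println("pool size: " + pool.getPoolSize());     // 2
            System.out.println("queued:    " + pool.getQueue().size()); // 998
            gate.countDown();
            pool.shutdown();
        }
    }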

Developers see the cached thread pool and blindly use it without really thinking through the consequences.

answered by Jed Wesley-Smith