
Java Performance Processes vs Threads

I am implementing a worker pool in Java.

This is essentially a whole load of objects which will pick up chunks of data, process the data and then store the result. Because of I/O latency, there will be significantly more workers than processor cores.

The server is dedicated to this task and I want to wring the maximum performance out of the hardware (but no, I don't want to implement it in C++).

The simplest implementation would be to have a single Java process which creates and monitors a number of worker threads. An alternative would be to run a Java process for each worker.

Assuming, for argument's sake, a quad-core Linux server, which of these solutions would you anticipate being more performant, and why?

You can assume the workers never need to communicate with one another.

Nick Long asked Oct 26 '11 13:10



1 Answer

One process, multiple threads - for a few reasons.

When context-switching between jobs, it's cheaper on some processors to switch between threads than between processes. This is especially important in this kind of I/O-bound case with more workers than cores. The more work you do between blocking on I/O, the less important this is. Good buffering will pay off for threads or processes, though.

When switching between threads in the same JVM, at least some Linux implementations (x86, in particular) don't need to flush the cache. See Tsuna's blog. Cache pollution between threads will also be minimized, since they can share the program cache, are performing the same task, and are sharing the same copy of the code. We're talking savings on the order of hundreds of nanoseconds to several microseconds per switch. If that's small potatoes for you, then read on...

Depending on the design, the I/O data path may be shorter for one process.

The startup and warmup time for a thread is generally much shorter. The OS doesn't have to start a process, Java doesn't have to start another JVM, classloading is only done once, JIT-compilation is only done once, and HotSpot optimizations are done once, and sooner.
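
As a rough illustration of the one-process, many-threads layout, here is a minimal sketch using a fixed-size thread pool from java.util.concurrent. The class name WorkerPoolSketch, the process method (a stand-in for "fetch a chunk, process it, store the result") and the factor of 4 workers per core are all assumptions made for the example, not anything from the question; the right worker count depends on how long each worker spends blocked on I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkerPoolSketch {

    // Hypothetical placeholder for the real work: read a chunk,
    // process it, and store the result (here it just upper-cases).
    static String process(String chunk) {
        return chunk.toUpperCase();
    }

    // Runs every chunk on a pool of `workers` threads inside one JVM
    // and returns the results in the same order as the input.
    static List<String> runAll(List<String> chunks, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String chunk : chunks) {
                Callable<String> task = () -> process(chunk);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task is done
            }
            return results;
        } finally {
            pool.shutdown(); // stop accepting work; idle threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        int workers = cores * 4; // assumed oversubscription for I/O-bound work
        System.out.println(runAll(List.of("a", "b", "c"), workers));
    }
}
```

All workers here share one heap, one set of loaded classes and one body of JIT-compiled code, which is where the startup and warmup savings come from.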

Ed Staub answered Oct 16 '22 00:10
