
Java thread creation overhead

Conventional wisdom tells us that high-volume enterprise Java applications should use thread pooling in preference to spawning new worker threads. The use of java.util.concurrent makes this straightforward.

There do exist situations, however, where thread pooling is not a good fit. The specific example which I am currently wrestling with is the use of InheritableThreadLocal, which allows ThreadLocal variables to be "passed down" to any spawned threads. This mechanism breaks when using thread pools, since the worker threads are generally not spawned from the request thread, but are pre-existing.
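To make the failure mode concrete, here is a minimal sketch, assuming Java 8 or later for the lambda syntax. The class name InheritanceDemo and the REQUEST_ID variable are hypothetical and only there for illustration. A freshly spawned thread copies the inheritable value from its parent at construction time, whereas a pool worker that already exists never sees it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class InheritanceDemo {
    // Hypothetical per-request context; the name is illustrative only.
    static final InheritableThreadLocal<String> REQUEST_ID = new InheritableThreadLocal<>();

    /** Reads REQUEST_ID from a thread spawned directly by the caller. */
    static String readInFreshThread() throws InterruptedException {
        final String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = REQUEST_ID.get());
        child.start();
        child.join(); // join() makes the write to seen[0] visible here
        return seen[0];
    }

    /** Reads REQUEST_ID from whichever worker the pool hands the task to. */
    static String readInPooledThread(ExecutorService pool) throws Exception {
        Callable<String> task = () -> REQUEST_ID.get();
        return pool.submit(task).get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Force the single worker into existence before any value is set,
            // which mimics a pool that was warmed up before the request arrived.
            pool.submit(() -> { }).get();

            REQUEST_ID.set("request-42");
            System.out.println("fresh thread:  " + readInFreshThread());
            System.out.println("pooled thread: " + readInPooledThread(pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

The fresh thread reports the value set by the request thread, while the pre-existing pool worker reports null, because inheritance happens only once, when the worker Thread object is constructed.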

Now there are ways around this (the thread locals can be explicitly passed in), but this isn't always appropriate or practical. The simplest solution is to spawn new worker threads on demand, and let InheritableThreadLocal do its job.
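For reference, the explicit hand-off could look roughly like the sketch below, assuming Java 8 or later. The names ContextPropagation, REQUEST_ID and withContext are hypothetical helpers, not part of java.util.concurrent. The wrapper reads the value on the submitting thread and installs it on the worker only for the duration of the task.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContextPropagation {
    // Hypothetical per-request context; a plain ThreadLocal suffices here.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    /** Wraps a task so that it runs with the submitting thread's REQUEST_ID. */
    static <T> Callable<T> withContext(Callable<T> task) {
        final String captured = REQUEST_ID.get(); // read on the request thread
        return () -> {
            String previous = REQUEST_ID.get();
            REQUEST_ID.set(captured);             // install on the worker thread
            try {
                return task.call();
            } finally {
                REQUEST_ID.set(previous);         // do not leak into the next task
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            REQUEST_ID.set("request-42");
            Callable<String> task = () -> REQUEST_ID.get();
            System.out.println(pool.submit(withContext(task)).get());
        } finally {
            pool.shutdown();
        }
    }
}
```

The drawback is visible in the sketch: every submission site has to remember to call the wrapper, and every thread local has to be known to it, which is why this is not always practical.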

This brings us back to the question - if I have a high volume site, where user request threads are spawning off half a dozen worker threads each (i.e. not using a thread pool), is this going to give the JVM a problem? We're potentially talking about a couple of hundred new threads being created every second, each one lasting less than a second. Do modern JVMs optimize this well? I remember the days when object pooling was desirable in Java, because object creation was expensive. This has since become unnecessary. I'm wondering if the same applies to thread pooling.

I'd benchmark it, if I knew what to measure, but my fear is that the problems may be more subtle than can be measured with a profiler.

Note: the wisdom of using thread locals is not the issue here, so please don't suggest that I not use them.

skaffman asked Jan 22 '10 12:01




1 Answer

Here is an example microbenchmark:

    public class ThreadSpawningPerformanceTest {
        static long test(final int threadCount, final int workAmountPerThread) throws InterruptedException {
            Thread[] tt = new Thread[threadCount];
            final int[] aa = new int[tt.length];
            System.out.print("Creating "+tt.length+" Thread objects... ");
            long t0 = System.nanoTime(), t00 = t0;
            for (int i = 0; i < tt.length; i++) {
                final int j = i;
                tt[i] = new Thread() {
                    public void run() {
                        int k = j;
                        for (int l = 0; l < workAmountPerThread; l++) {
                            k += k*k+l;
                        }
                        aa[j] = k;
                    }
                };
            }
            System.out.println(" Done in "+(System.nanoTime()-t0)*1E-6+" ms.");
            System.out.print("Starting "+tt.length+" threads with "+workAmountPerThread+" steps of work per thread... ");
            t0 = System.nanoTime();
            for (int i = 0; i < tt.length; i++) {
                tt[i].start();
            }
            System.out.println(" Done in "+(System.nanoTime()-t0)*1E-6+" ms.");
            System.out.print("Joining "+tt.length+" threads... ");
            t0 = System.nanoTime();
            for (int i = 0; i < tt.length; i++) {
                tt[i].join();
            }
            System.out.println(" Done in "+(System.nanoTime()-t0)*1E-6+" ms.");
            long totalTime = System.nanoTime()-t00;
            // display checksum in order to give the JVM no chance to optimize out
            // the contents of the run() method and possibly even thread creation
            int checkSum = 0;
            for (int a : aa) {
                checkSum += a;
            }
            System.out.println("Checksum: "+checkSum);
            System.out.println("Total time: "+totalTime*1E-6+" ms");
            System.out.println();
            return totalTime;
        }

        public static void main(String[] kr) throws InterruptedException {
            int workAmount = 100000000;
            int[] threadCount = new int[]{1, 2, 10, 100, 1000, 10000, 100000};
            int trialCount = 2;
            long[][] time = new long[threadCount.length][trialCount];
            for (int j = 0; j < trialCount; j++) {
                for (int i = 0; i < threadCount.length; i++) {
                    time[i][j] = test(threadCount[i], workAmount/threadCount[i]);
                }
            }
            System.out.print("Number of threads ");
            for (long t : threadCount) {
                System.out.print("\t"+t);
            }
            System.out.println();
            for (int j = 0; j < trialCount; j++) {
                System.out.print((j+1)+". trial time (ms)");
                for (int i = 0; i < threadCount.length; i++) {
                    System.out.print("\t"+Math.round(time[i][j]*1E-6));
                }
                System.out.println();
            }
        }
    }

The results on 64-bit Windows 7 with Sun's 32-bit Java 1.6.0_21 Client VM on an Intel Core2 Duo E6400 @ 2.13 GHz are as follows:

    Number of threads     1    2    10   100  1000  10000  100000
    1. trial time (ms)    346  181  179  191  286   1229   11308
    2. trial time (ms)    346  181  187  189  281   1224   10651

Conclusions: Two threads do the work almost twice as fast as one, as expected since my computer has two cores. My computer can spawn nearly 10000 threads per second, i.e. the thread creation overhead is about 0.1 milliseconds per thread. Hence, on such a machine, a couple of hundred new threads per second pose a negligible overhead (as can also be seen by comparing the numbers in the columns for 2 and 100 threads).
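The arithmetic behind that estimate can be spelled out as a small sketch. It uses the first-trial figures from the table and treats the 2-thread run as the baseline cost of the computation itself. The class name, the helper method and the figure of 200 threads per second (standing in for "a couple of hundred") are assumptions made for illustration.

```java
public class SpawnOverheadEstimate {
    /** Extra wall-clock time per thread, in ms, relative to a baseline run. */
    static double perThreadOverheadMs(double totalMs, double baselineMs, int threads) {
        return (totalMs - baselineMs) / threads;
    }

    public static void main(String[] args) {
        // First trial: 181 ms with 2 threads, 11308 ms with 100000 threads.
        double overheadMs = perThreadOverheadMs(11308, 181, 100000);

        // Assumed load: 200 short-lived threads per second.
        double costPerSecondMs = overheadMs * 200;

        System.out.println("per-thread overhead (ms): " + overheadMs);
        System.out.println("spawn cost per second of load (ms): " + costPerSecondMs);
    }
}
```

This works out to roughly 0.11 ms per thread and a little over 20 ms of spawning cost per second of wall-clock time on that machine, which is where the "negligible" verdict comes from. The figure only holds for comparable hardware and JVM versions.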

Jaan answered Sep 23 '22 23:09