
Why is my multithreaded Java program not maxing out all my cores on my machine?

I have a program that starts up and creates an in-memory data model and then creates a (command-line-specified) number of threads to run several string checking algorithms against an input set and that data model. The work is divided amongst the threads along the input set of strings, and then each thread iterates the same in-memory data model instance (which is never updated again, so there are no synchronization issues).
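For readers who want something concrete to experiment with, the setup described above can be sketched roughly as follows. This is a minimal sketch, not the asker's actual code: the class name `PartitionedCheck`, the `checkSlice` helper, and the use of a `Set<String>` as the read-only data model are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionedCheck {

    // Hypothetical stand-in for the real string checks:
    // counts how many inputs in the slice occur in the shared model.
    public static int checkSlice(List<String> slice, Set<String> model) {
        int hits = 0;
        for (String s : slice) {
            if (model.contains(s)) {
                hits++;
            }
        }
        return hits;
    }

    // Splits the inputs into contiguous slices, one task per slice.
    // The model is only read, never written, so no locking is needed.
    public static int runPartitioned(List<String> inputs, Set<String> model, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (inputs.size() + threads - 1) / threads; // ceiling division
            List<Future<Integer>> results = new ArrayList<>();
            for (int start = 0; start < inputs.size(); start += chunk) {
                int end = Math.min(start + chunk, inputs.size());
                List<String> slice = inputs.subList(start, end);
                results.add(pool.submit(() -> checkSlice(slice, model)));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // blocks until that slice is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Set<String> model = Set.of("alpha", "beta", "gamma");
        List<String> inputs = List.of("alpha", "delta", "beta", "beta", "omega");
        System.out.println(runPartitioned(inputs, model, 3)); // prints 3
    }
}
```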

I'm running this on a Windows 2003 64-bit server with two quad-core processors. Looking at Windows Task Manager, the cores aren't being maxed out (nor do they look particularly taxed) when I run with 10 threads. Is this normal behaviour?

It appears that 7 threads all complete a similar amount of work in a similar amount of time, so would you recommend running with 7 threads instead?

Should I run it with more threads? I assume this could be detrimental, though, as the JVM will do more context switching between the threads.

Alternatively, should I run it with fewer threads?

Alternatively, what would be the best tool to measure this? Would a profiling tool help me out here? And if so, is one of the several profilers better than the rest at detecting bottlenecks (assuming I have one here)?

Note, the server is also running SQL Server 2005 (this may or may not be relevant), but nothing much is happening on that database when I am running my program.

Note also, the threads are only doing string matching, they aren't doing any I/O or database work or anything else they may need to wait on.

James B asked Feb 03 '26 06:02


1 Answer

My guess would be that your app is bottlenecked on memory access, i.e. your CPU cores spend most of the time waiting for data to be read from main memory. I'm not sure how well profilers can diagnose this kind of problem (the profiler itself could influence the behaviour considerably). You could verify the guess by having your code repeat the operations it does many times on a very small data set, one small enough to fit in the CPU caches. If the cores are maxed out in that run but not with the full data set, memory access is the likely bottleneck.
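One way to run that experiment is sketched below. It is only an illustration under stated assumptions: the class `ScalingProbe`, the array-summing workload, and the chosen sizes (16 KB versus 64 MB) are hypothetical stand-ins for the real string checks and data model. Every thread does the same amount of work, so with enough free cores a CPU-bound run should take roughly the same time at each thread count, while a memory-bound run tends to slow down as threads are added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScalingProbe {

    // Hypothetical workload: scans the shared read-only array `passes` times.
    public static long scan(int[] data, int passes) {
        long sum = 0;
        for (int p = 0; p < passes; p++) {
            for (int v : data) {
                sum += v;
            }
        }
        return sum;
    }

    // Runs `threads` identical scans concurrently.
    // Returns { elapsed nanoseconds, combined checksum of all scans }.
    public static long[] run(int[] data, int passes, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                tasks.add(() -> scan(data, passes));
            }
            long start = System.nanoTime();
            long checksum = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                checksum += f.get();
            }
            return new long[] { System.nanoTime() - start, checksum };
        } finally {
            pool.shutdown();
        }
    }

    public static int[] filled(int size) {
        int[] a = new int[size];
        for (int i = 0; i < size; i++) {
            a[i] = i & 0xFF;
        }
        return a;
    }

    public static void main(String[] args) throws Exception {
        int[] small = filled(4 * 1024);          // 16 KB, fits in cache
        int[] large = filled(16 * 1024 * 1024);  // 64 MB, larger than typical caches
        int largePasses = 4;
        int smallPasses = largePasses * (large.length / small.length); // same element count
        for (int threads = 1; threads <= 8; threads *= 2) {
            long smallMs = run(small, smallPasses, threads)[0] / 1_000_000;
            long largeMs = run(large, largePasses, threads)[0] / 1_000_000;
            System.out.println(threads + " threads: small=" + smallMs + " ms, large=" + largeMs + " ms");
        }
    }
}
```

Timings from a hand-rolled loop like this are noisy, since JIT warm-up and other processes on the machine affect them, so treat the numbers as a rough indication only. A benchmark harness such as JMH is the safer choice for anything you intend to act on.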

If this guess is correct, the only thing you can do (other than getting a server with more memory bandwidth) is to try and increase the locality of your memory access to make better use of caches; but depending on the details of the application that may not be possible. Using more threads may in fact lead to worse performance because of cores sharing cache memory.
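As one illustration of what increasing locality can look like in Java, the sketch below packs a set of strings into a single contiguous `byte[]` with an offsets table, so a scan walks sequential memory instead of following a reference to a separate `String` object for each entry. The class `PackedStrings` and its linear exact-match `indexOf` are assumptions for illustration only. Whether anything similar fits the real data model and checking algorithms depends on details the question does not give.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

public class PackedStrings {
    private final byte[] bytes;   // all strings encoded back to back
    private final int[] offsets;  // string i occupies bytes[offsets[i] .. offsets[i + 1])

    public PackedStrings(List<String> strings) {
        offsets = new int[strings.size() + 1];
        byte[][] encoded = new byte[strings.size()][];
        int total = 0;
        for (int i = 0; i < strings.size(); i++) {
            encoded[i] = strings.get(i).getBytes(StandardCharsets.UTF_8);
            offsets[i] = total;
            total += encoded[i].length;
        }
        offsets[strings.size()] = total;
        bytes = new byte[total];
        for (int i = 0; i < encoded.length; i++) {
            System.arraycopy(encoded[i], 0, bytes, offsets[i], encoded[i].length);
        }
    }

    // Linear scan for an exact match; returns the index of the first match or -1.
    public int indexOf(String query) {
        byte[] q = query.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i + 1 < offsets.length; i++) {
            int start = offsets[i];
            if (offsets[i + 1] - start != q.length) {
                continue; // different length, cannot match
            }
            int j = 0;
            while (j < q.length && bytes[start + j] == q[j]) {
                j++;
            }
            if (j == q.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        PackedStrings model = new PackedStrings(List.of("alpha", "beta", "gamma"));
        System.out.println(model.indexOf("beta"));   // prints 1
        System.out.println(model.indexOf("delta"));  // prints -1
    }
}
```

Because the packed structure is immutable after construction, it can be shared between threads without synchronization, just like the original model.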

Michael Borgwardt answered Feb 04 '26 18:02


