We are currently trying to improve performance by using multithreading in our Java app. We have a long-running serial task which we would like to split across multiple CPU cores.
Basically, we have a list with, let's say, 100,000 items / things to do.
My question is whether it is better to do:
Option 1 (Pseudocode):
for (i = 0; i < 100000; i++) {
    threadpool.submit(new MyCallable("1 thing to do"));
}
This would add 100000 runnables/callables to the queue of the thread pool (currently a LinkedBlockingQueue).
Or is it better to do Option 2 (Pseudocode):
for (i = 0; i < 4; i++) {
    threadpool.submit(new MyCallable("25000 things to do"));
}
We have already tried option 1 and didn't notice any performance improvement, although we can clearly see that multiple threads are working like crazy and that all 4 CPU cores are in use. My feeling is that option 1 carries some overhead because of the sheer number of tasks. We haven't tried option 2 yet, but I suspect it could speed things up because there is less overhead: we would basically be splitting the list into 4 larger chunks instead of 100000 single items.
Any thoughts on this?
Thanks
What matters is that you minimize the amount of context switching and maximize the share of each task's time that is spent computing. As a practical matter, if your tasks are compute-bound, running more threads than you have physical CPUs isn't going to help. If your tasks actually do a lot of I/O and spend time in I/O waits, you want to have many of them, so there is always a bunch of "ready" tasks available when one blocks.
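As a rough sketch of that sizing rule in Java: the class name PoolSizing, its helper methods, and the factor of 4 for I/O-heavy work are all made up for illustration, not a recommendation. Note that Runtime.availableProcessors() reports the processors visible to the JVM, which may be logical (hyper-threaded) cores rather than physical CPUs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: picks a thread count depending on the kind of work.
final class PoolSizing {

    // Compute-bound work: roughly one thread per processor the JVM can see.
    static int cpuBoundThreads() {
        return Runtime.getRuntime().availableProcessors();
    }

    // I/O-bound work: oversubscribe so some task is ready while others block.
    // blockingFactor is an assumed tuning knob; measure before settling on one.
    static int ioBoundThreads(int blockingFactor) {
        if (blockingFactor < 1) {
            throw new IllegalArgumentException("blockingFactor must be >= 1");
        }
        return cpuBoundThreads() * blockingFactor;
    }

    public static void main(String[] args) {
        ExecutorService cpuPool = Executors.newFixedThreadPool(cpuBoundThreads());
        ExecutorService ioPool = Executors.newFixedThreadPool(ioBoundThreads(4));
        System.out.println("CPU-bound pool size: " + cpuBoundThreads());
        System.out.println("I/O-bound pool size: " + ioBoundThreads(4));
        // Shut the pools down so the JVM can exit.
        cpuPool.shutdown();
        ioPool.shutdown();
    }
}
```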
If you really have 25000 things to do, and the things are computation, I'd probably set up 32 threads (more than you have CPUs, but without a lot of extra overhead) and parcel out 10-50 units of work at a time to each one if those units are relatively small.
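A middle ground between the two options in the question might look like the sketch below: the list is cut into small batches, and each batch becomes one task. The class BatchedSubmit, the process method (a stand-in for "1 thing to do") and the batch size of 50 are assumptions for illustration. Whether this is actually faster depends on how expensive each item is, so it is worth measuring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BatchedSubmit {

    // Hypothetical stand-in for the real per-item work.
    static long process(int item) {
        return (long) item * item;
    }

    // Splits items into batches of batchSize and runs each batch as one task.
    static long processInBatches(List<Integer> items, int threads, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int start = 0; start < items.size(); start += batchSize) {
                int end = Math.min(start + batchSize, items.size());
                // subList is a view; safe here because items is not modified.
                List<Integer> batch = items.subList(start, end);
                tasks.add(() -> {
                    long sum = 0;
                    for (int item : batch) {
                        sum += process(item);
                    }
                    return sum;
                });
            }
            long total = 0;
            // invokeAll blocks until every batch has finished.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            items.add(i);
        }
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(processInBatches(items, threads, 50));
    }
}
```

With 100,000 items and a batch size of 50 this queues 2,000 tasks instead of 100,000, while still leaving enough tasks that no thread sits idle if some batches finish early.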