I have to write a not-so-large program in C++, using boost::thread.
The problem at hand is to process a large number of (possibly) large files: maybe thousands or tens of thousands, and hundreds of thousands or even millions are a possibility as well. Each file is independent of the others, and they all reside in the same directory. I'm thinking of using a multi-threaded approach, but the question is: how many threads should I use? I mean, what order of magnitude? 10, 500, 12400?
There are some synchronization issues: each thread should return a struct of values (accumulated per file), and those are added to a "global" struct to get the overall data. I realize that some threads could starve because of synchronization, but if it's only an add operation, does it matter?
I was thinking of something like:

for (each file f in directory) {
    while (N >= max_threads)  // N is a shared counter of running threads
        sleep();              // wait until a worker finishes and decrements N
    thread_process(f);        // spawns a thread, which increments N
}
This is on HP-UX, but I won't be able to test it often, since it's a remote and fairly inaccessible server.
"Is there such a thing as too many threads?" - Yes. Threads consume system resources that you may run out of, and they need to be scheduled, which requires work by the kernel as well as time on the CPU (even if the threads then decide to do nothing).
If your thread usage peaks at 3, then 100 is too many. If it stays at 100 for most of the day, bump it up to 200 and see what happens. You could even have the program monitor its own usage and adjust the configuration for the next time it starts, but that's probably overkill.
Ideally, the total thread count across all jobs should equal the number of cores in the system, except on systems that support hyper-threading, where it should be twice the number of cores. So on an 8-core system without hyper-threading, 8 CPU-bound calculations should run in 8 threads, one each.
4.2. Windows. On Windows machines, there's no limit specified for threads. Thus, we can create as many threads as we want, until our system runs out of available system memory.
According to Amdahl's law, which Herb Sutter discussed in his article:
Some amount of a program's processing is fully "O(N)" parallelizable (call this portion p), and only that portion can scale directly on machines having more and more processor cores. The rest of the program's work is "O(1)" sequential (s). [1,2] Assuming perfect use of all available cores and no parallelization overhead, Amdahl's Law says that the best possible speedup of that program workload on a machine with N cores is given by

    speedup = (s + p) / (s + p/N)
In your case, I/O operations could take most of the time, as could synchronization. You could measure the time that will be spent blocked in slow I/O operations and from that estimate, approximately, the number of threads that suits your task.
The full list of Herb Sutter's concurrency-related articles can be found here.