So, as part of a school assignment, we are being asked to determine what our optimum thread count is for our personal computers by constructing a toy program.
To start, we are to create a task that takes between 20 and 30 seconds to run. I chose to do a coin toss simulation, where the total number of heads and tails are accumulated and then displayed. On my machine, 300,000,000 tosses on one thread ended up at 25 seconds. After that, I went to 2 threads, then 4, then 8, 16, 32, and, just for fun, 100. Here are the results:
Threads   Tosses per thread   Time (seconds)
--------------------------------------------
1         300,000,000         25
2         150,000,000         13
4          75,000,000         13
8          37,500,000         13
16         18,750,000         14
32          9,375,000         14
100         3,000,000         14
And here is the code I'm using:
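    #include <ctime>
    #include <iostream>
    #include <random>
    #include <thread>
    #include <vector>
    using namespace std;

    // Each thread tosses a coin `max` times and prints its own tally.
    void toss()
    {
        int heads = 0, tails = 0;
        default_random_engine gen;   // default seed: every thread draws the same sequence
        uniform_int_distribution<int> dist(0, 1);
        int max = 3000000;           // tosses per thread
        for (int x = 0; x < max; ++x) { dist(gen) ? ++heads : ++tails; }
        cout << heads << " " << tails << endl;
    }

    int main()
    {
        vector<thread> thr;
        time_t st, fin;
        st = time(0);                // one-second resolution
        for (int i = 0; i < 100; ++i) { thr.push_back(thread(toss)); } // thread count
        for (auto& t : thr) { t.join(); }
        fin = time(0);
        cout << fin - st << " seconds\n";
        return 0;
    }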
Now for the main question:
Past a certain point, I would've expected there to be a considerable decline in computing speed as more threads were added, but the results don't seem to show that.
Is there something fundamentally wrong with my code that would yield these sorts of results, or is this behavior considered normal? I'm very new to multi-threading, so I have a feeling it's the former....
Thanks!
EDIT: I am running this on a MacBook with a 2.16 GHz Core 2 Duo (T7400) processor.
Ideally, the total thread count across all jobs should equal the number of cores in the system, or twice the number of cores on systems that support hyper-threading. So on a system without hyper-threading that is running 8 calculations, each calculation should run in a single thread.
Your results seem very normal to me. While thread creation has a cost, it's not that much (especially compared to the per-second granularity of your tests). An extra 100 thread creations, destructions, and possible context switches aren't going to change your timing by more than a few milliseconds, I bet.
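Since time(0) only resolves whole seconds, a difference of a few milliseconds cannot show up in the question's table. Here is a minimal sketch of a finer-grained measurement using std::chrono::steady_clock. The helper name time_threads is made up for illustration and is not part of the original program:

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper: runs `work` on `thread_count` threads and
// returns the elapsed wall-clock time in (fractional) seconds.
template <typename Fn>
double time_threads(unsigned thread_count, Fn work) {
    auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> threads;
    threads.reserve(thread_count);
    for (unsigned i = 0; i < thread_count; ++i) {
        threads.emplace_back(work);   // each thread runs its own copy of `work`
    }
    for (auto& t : threads) {
        t.join();                     // wait for every thread to finish
    }

    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

With the question's function, a call such as time_threads(100, toss) would replace the two time(0) calls and the loops between them.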
Running on my Intel i7-4790 @ 3.60 GHz I get these numbers:
threads - seconds
-----------------
1 - 6.021
2 - 3.205
4 - 1.825
8 - 1.062
16 - 1.128
32 - 1.138
100 - 1.213
1000 - 2.312
10000 - 23.319
It takes many, many more threads to get to the point at which the extra threads make a noticeable difference. Only when I get to 1,000 threads do I see that the thread-management has made a significant difference and at 10,000 it dwarfs the loop (the loop is only doing 30,000 tosses at that point).
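The runs above keep the total number of tosses fixed and divide it by the thread count. One way to do that without editing a constant for every run is sketched below. The function names and the per-thread seeding scheme are assumptions for illustration, not the code that produced the numbers above, and thread_count is assumed to be at least 1:

```cpp
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// Tosses `count` coins with a generator seeded by `seed`; returns the heads.
std::uint64_t count_heads(std::uint64_t count, unsigned seed) {
    std::default_random_engine gen(seed);
    std::uniform_int_distribution<int> dist(0, 1);
    std::uint64_t heads = 0;
    for (std::uint64_t i = 0; i < count; ++i) {
        heads += dist(gen);           // dist yields 0 (tails) or 1 (heads)
    }
    return heads;
}

// Splits `total` tosses across `thread_count` threads (assumed >= 1).
// The last thread also takes the remainder of the division.
std::uint64_t parallel_heads(std::uint64_t total, unsigned thread_count) {
    std::vector<std::uint64_t> results(thread_count, 0);
    std::vector<std::thread> threads;
    const std::uint64_t share = total / thread_count;

    for (unsigned i = 0; i < thread_count; ++i) {
        const std::uint64_t n =
            (i + 1 == thread_count) ? total - share * i : share;
        // Each thread writes only to its own slot, so no locking is needed.
        threads.emplace_back([&results, i, n] {
            results[i] = count_heads(n, i + 1);
        });
    }
    for (auto& t : threads) {
        t.join();
    }

    std::uint64_t heads = 0;
    for (std::uint64_t h : results) {
        heads += h;
    }
    return heads;
}
```

Returning the tallies instead of printing them from every thread also keeps console output out of the timed region.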
As for your assignment, it should be fairly straightforward to see that the optimal number of threads for your system is the number of threads the hardware can execute at once. There is no processing power left to execute another thread until one has either finished or yielded, so extra threads don't help you finish faster. And with any fewer threads, you aren't using all available resources. My CPU has 8 hardware threads, and the chart reflects that.
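The number of threads the hardware can run at once can also be queried at runtime. This is a minimal sketch using std::thread::hardware_concurrency(), which is only a hint and may return 0 when the value cannot be determined. The wrapper name and its fallback parameter are made up for illustration:

```cpp
#include <thread>

// Hypothetical wrapper: returns the number of hardware threads reported
// by the standard library, or `fallback` if the value is unknown.
unsigned optimal_thread_count(unsigned fallback = 1) {
    unsigned n = std::thread::hardware_concurrency();  // 0 means "unknown"
    return n != 0 ? n : fallback;
}
```

On the i7-4790 above this would be expected to report 8, and on a Core 2 Duo it would be expected to report 2, which lines up with where each table stops improving.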