In short:
Under what scenarios can running a multithreaded app on a single core destroy performance?
What about setting the affinity of a multithreaded app to only use one core?
In long:
I'm trying to run the physics of a 2D engine on its own thread. It works, and at first performance seemed normal. Then I told the game to try to run at 10K FPS and the physics at 120 FPS, went into Task Manager, and set the affinity so the program could only use one core.
The FPS was at ~1700 before setting the affinity to one core; afterwards it dropped to ~70 FPS. I didn't expect that kind of decrease, so I told the game to try to run at 300 FPS and the physics at 60 FPS.
The same thing happened.
I didn't give it much thought, so I just kept modifying the engine. I tested it again later, after changing some of the drawing code, at 300 FPS with physics at 60 FPS. With all cores allowed it managed 300 FPS just fine; with affinity set to a single core, the FPS dropped to 4. Surely running a multithreaded app on a single core can't be that bad, or am I missing something that happens when you restrict the affinity to one core?
This is about how the rendering/physics runs...
Loop starts.
Gather input until (1.0 / FPS) has passed.
Call update.
Lock the physics thread mutex, because things in the game will be using the physics data and I don't want the engine updating anything until everything in this update call finishes.
Update everything in the game, which may send a Draw function object (holds what to draw, where to draw, how to draw) to the render queue.
Unlock the mutex.
The renderer calls operator() on each function object and removes it from the queue.
Update the screen.
Repeat the loop.
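In rough code, the main loop looks something like this (just a simplified sketch — gatherInput, updateGame, m_running and m_renderQueue are stand-ins for the real names):

while (m_running)
{
    gatherInput();                      // poll input until 1.0 / FPS has elapsed

    {
        auto_mutex lock(m_mutex);       // physics thread can't step while we update
        updateGame();                   // game objects may push Draw function objects onto m_renderQueue
    }                                   // mutex unlocked here

    while (!m_renderQueue.empty())
    {
        m_renderQueue.front()();        // operator() does the actual drawing
        m_renderQueue.pop();
    }

    al_flip_display();                  // update the screen
}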
Physics thread loop:
ALLEGRO_TIMER* timer(al_create_timer(1.0f / 60.0f));
double prevCount(0);
al_start_timer(timer);
while (true)
{
    auto_mutex lock(m_mutex);
    if (m_shutdown)
        break;
    if (!m_allowedToStep)
        continue;
    // Don't run too fast. This isn't final, just simple test code.
    if (!(al_get_timer_count(timer) > prevCount))
        continue;
    prevCount = al_get_timer_count(timer);
    m_world->Step(1.0f / 60.0f, 10, 10);
    m_world->ClearForces();
}
Note: auto_mutex is just a really simple object I made that locks a mutex in its constructor and unlocks it in its destructor. I'm using Allegro 5's threading functions.
Under what scenarios can running a multithreaded app on a single core destroy performance?
What about setting the affinity of a multithreaded app to only use one core?
In both cases, the answer is much the same. If your program is running on a single core, then only one thread runs at a time. And that means that any time one thread has to wait for another, you need the OS to perform a context switch, which is a fairly expensive operation.
When run on multiple cores, there's a decent chance that the two threads that need to interact will both be running simultaneously, and so the OS won't need to perform a context switch for your code to proceed.
So really, code which requires a lot of inter-thread synchronization is going to run slower on a single core.
But you can make it worse. A spinlock, or any kind of busy-waiting loop, will absolutely destroy performance. And it should be obvious why: you can only run one thread at a time, so if you need a thread to wait for some event, you should tell the OS to put it to sleep immediately, so that another thread can run.
If instead you just do some "while condition is not met, keep looping" busy loop, you're keeping the thread running even though it has nothing to do. It'll continue looping *until* the OS decides that its time is up and schedules another thread. (And if the thread doesn't get blocked by something, it'll typically be allowed to run for upwards of 10 milliseconds at a time.)
In multithreaded programming in general, and *especially* in multithreaded code running on a single core, you need to play nice and not hog the CPU core more than necessary. If you have nothing sensible to do, allow another thread to run.
And guess what your code is doing.
What do you think the effect of these lines is?
if (!(al_get_timer_count(timer) > prevCount))
continue;
RUN THE LOOP! AM I READY TO RUN YET? NO? THEN RUN THE LOOP AGAIN. AM I READY TO RUN NOW? STILL NO? RUN THE LOOP AGAIN.....
In other words, "I have the CPU now, AND I WILL NEVER SURRENDER! IF SOMEONE ELSE WANTS THE CPU THEY'LL HAVE TO TAKE IT FROM MY COLD DEAD BODY!"
If you have nothing to use the CPU for, then give it up, especially if you have another thread that is ready to run.
Use a mutex or some other synchronization primitive, or, if you're OK with a more approximate time-based sleep period, call Sleep().
But if you want any kind of decent performance, don't hog the CPU indefinitely while you're waiting for another thread to do some processing.
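For example, rather than polling al_get_timer_count() in a tight loop, you could block on the timer's event source so the physics thread actually sleeps between ticks. This is only a sketch of the idea (it reuses your member names and leaves out error handling and the details of shutdown signalling):

ALLEGRO_TIMER* timer = al_create_timer(1.0 / 60.0);
ALLEGRO_EVENT_QUEUE* queue = al_create_event_queue();
al_register_event_source(queue, al_get_timer_event_source(timer));
al_start_timer(timer);

while (true)
{
    ALLEGRO_EVENT ev;
    al_wait_for_event(queue, &ev);      // the thread sleeps here until the next tick

    if (ev.type != ALLEGRO_EVENT_TIMER)
        continue;

    auto_mutex lock(m_mutex);           // only hold the lock while stepping
    if (m_shutdown)
        break;
    if (!m_allowedToStep)
        continue;

    m_world->Step(1.0f / 60.0f, 10, 10);
    m_world->ClearForces();
}

al_destroy_event_queue(queue);
al_destroy_timer(timer);

Even a plain al_rest() in the "nothing to do yet" branches would be a big improvement over spinning, because it hands the core back to the scheduler instead of burning your time slice.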
When you look at a processor, don't think of it as a block that just calculates one thing after another; think of it as a calculator you have to reserve time on.
Windows (and every other operating system) makes sure that time is shared among all running apps. When you run a program, the computer doesn't just do all the calculations the new program wants; Windows allocates the program a specific slice of time, and when that slice is over, the next program gets its turn. Windows does all this for you, so it only matters here because it helps explain the behavior you're seeing.
This does affect how you look at multithreading, though, because when Windows sees a multithreaded application, it essentially says "I will handle this as two separate programs" and allocates time to both. Therefore one won't completely stop the other from doing calculations.
So no, running your program multithreaded will not tank its performance by itself, but it will make the other programs around it a little slower and add a small amount of overhead. If you're doing large calculations that would otherwise cause your program to hang, feel free to multithread.