
Do I get a performance penalty when mixing SIMD instructions and multithreading?

I was interested in doing a project on face recognition (to make use of SIMD instruction sets). But during the first semester of this year I learned something about threads, and I was wondering if I could combine the two.

When should I avoid combining multithreading and SIMD instructions? When is it worth doing?

A.J. asked Nov 08 '11 04:11




3 Answers

Saving the x87/MMX/XMM/YMM registers can take quite some time and cause significant cache thrashing. Normally, saving and restoring of FP state is done lazily: upon a context switch, the kernel remembers the current thread as the "owner" of the FP state and sets the TS flag in CR0, which causes a trap to the kernel the next time a thread attempts to execute an FP instruction. Only at that point is the FP state of the old thread saved and the FP state of the currently executing thread restored.

Now, if for extended periods of time (several or many context switches) no thread other than yours uses FP instructions, the lazy policy means no FP state is saved or restored at all, and you won't take a performance hit.
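To make the lazy policy concrete, here is a toy model of it, not kernel code. The `Cpu` struct, its fields, and its methods are all hypothetical names invented for illustration; `ts` stands in for the TS flag in CR0, and `swaps` counts how often FP state is actually saved/restored.

```rust
/// Toy model of lazy FP-state switching. Everything here is a
/// simplified, hypothetical illustration, not how any real kernel is written.
struct Cpu {
    fp_owner: Option<usize>, // thread whose FP state is live in the registers
    ts: bool,                // models CR0.TS: the next FP instruction traps
    current: usize,          // thread currently running
    swaps: u32,              // number of actual FP save/restore operations
}

impl Cpu {
    fn new() -> Self {
        Cpu { fp_owner: None, ts: true, current: 0, swaps: 0 }
    }

    /// Context switch: FP registers are left alone, only the trap flag is armed.
    fn switch_to(&mut self, thread: usize) {
        self.current = thread;
        self.ts = true;
    }

    /// The current thread executes an FP/SIMD instruction.
    fn fp_instruction(&mut self) {
        if self.ts {
            // Trap to the "kernel": swap FP state only if the owner differs.
            if self.fp_owner != Some(self.current) {
                self.swaps += 1;
                self.fp_owner = Some(self.current);
            }
            self.ts = false;
        }
    }
}
```

In this model, if thread 0 is the only one that ever calls `fp_instruction()`, `swaps` stays at 1 (the initial load) no matter how many times `switch_to` is called; it only grows once a second thread starts using FP instructions too.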

Since we're obviously talking about a multiprocessor system, the threads that execute your algorithm in parallel won't conflict with each other, because each should run on its own CPU/core/hyperthread and have a private set of registers.
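As a rough sketch of what combining the two can look like (an illustration only, not taken from the question): each thread gets its own slice of the data and runs a SIMD loop over it. The function names `add_in_place` and `parallel_add` are made up for this example. It assumes x86_64 for the SSE path and falls back to scalar code on other targets.

```rust
use std::thread;

/// Adds `b` into `a` element-wise. On x86_64 the bulk of the work uses
/// 128-bit SSE instructions (four f32 lanes at a time).
fn add_in_place(a: &mut [f32], b: &[f32]) {
    assert_eq!(a.len(), b.len());
    let mut i = 0;

    #[cfg(target_arch = "x86_64")]
    {
        use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};
        // SSE is part of the x86_64 baseline, so no runtime feature check is done.
        while i + 4 <= a.len() {
            // SAFETY: i + 4 <= len for both slices, and the unaligned
            // load/store variants have no alignment requirement.
            unsafe {
                let va = _mm_loadu_ps(a.as_ptr().add(i));
                let vb = _mm_loadu_ps(b.as_ptr().add(i));
                _mm_storeu_ps(a.as_mut_ptr().add(i), _mm_add_ps(va, vb));
            }
            i += 4;
        }
    }

    // Scalar tail (and the whole slice on non-x86_64 targets).
    while i < a.len() {
        a[i] += b[i];
        i += 1;
    }
}

/// Splits the work into roughly `threads` disjoint chunks, one thread per chunk.
/// Each thread touches only its own chunk, so no locking is needed.
fn parallel_add(a: &mut [f32], b: &[f32], threads: usize) {
    assert_eq!(a.len(), b.len());
    let t = threads.max(1);
    let chunk = ((a.len() + t - 1) / t).max(1);
    thread::scope(|s| {
        for (ca, cb) in a.chunks_mut(chunk).zip(b.chunks(chunk)) {
            s.spawn(move || add_in_place(ca, cb));
        }
    });
}
```

Calling `parallel_add(&mut a, &b, 4)` on two equal-length `f32` slices adds `b` into `a` using up to four threads. The SIMD code inside each thread is exactly what it would be in a single-threaded program; the threading layer only decides which slice each thread works on.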

tl;dr

You shouldn't be concerned with the overhead of saving and restoring FP registers.

chill answered Oct 23 '22 18:10


Why do you think there would be a problem? SIMD registers will be swapped out like any other CPU registers when a thread change occurs.

O'Rooney answered Oct 23 '22 17:10


There aren't any new issues to worry about with multithreading and SIMD. So long as you're doing the SIMD correctly and efficiently, you shouldn't have anything to worry about.

That is, SIMD has its own implementation challenges, as does multithreading, but combining them won't make either more complex.

Kyle answered Oct 23 '22 17:10