
What is the limit of optimization using SIMD?

Tags: c, simd

I need to optimize some C code that does lots of physics computations, using the SIMD extensions on the SPEs of the Cell processor. Each vector operator can process 4 floats at the same time, so in the most optimistic case I would expect a 4x speedup.

Do you think the use of vector operators could give bigger speedups?

Thanks

Open the way asked Sep 05 '10 17:09


3 Answers

The best optimization comes from rethinking the algorithm. Eliminate unnecessary steps. Find a more direct way of accomplishing the same result. Compute the solution in a domain more relevant to the problem.

For example, if the vector array is a list of n points that all lie on the same line, then it is sufficient to transform only the end points and interpolate the intermediate points.
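The end-point idea can be sketched in plain C roughly as follows. This is only an illustration under assumptions that are not in the question: the points are 2-D, they are evenly spaced along the segment, and the transform is affine (so straight lines map to straight lines). The names `Point`, `Affine`, `apply_affine` and `transform_line` are made up for the example.

```c
#include <stddef.h>

/* Hypothetical types for this sketch: a 2-D point and an affine transform
 * (2x2 linear part a,b,c,d plus translation tx,ty). */
typedef struct { float x, y; } Point;
typedef struct { float a, b, c, d, tx, ty; } Affine;

/* Full transform of a single point: 4 multiplies and 4 adds. */
static Point apply_affine(const Affine *m, Point p)
{
    Point r;
    r.x = m->a * p.x + m->b * p.y + m->tx;
    r.y = m->c * p.x + m->d * p.y + m->ty;
    return r;
}

/* Produce the transformed images of n evenly spaced points on the segment
 * p0..p1. Only the two end points go through the transform; every other
 * output point is a linear interpolation between the transformed ends.
 * Assumption: evenly spaced input points and an affine transform. */
static void transform_line(const Affine *m, Point p0, Point p1,
                           Point *out, size_t n)
{
    Point q0 = apply_affine(m, p0);
    Point q1 = apply_affine(m, p1);
    float dx = q1.x - q0.x;
    float dy = q1.y - q0.y;

    for (size_t i = 0; i < n; ++i) {
        /* t runs from 0 at the first point to 1 at the last point. */
        float t = (n > 1) ? (float)i / (float)(n - 1) : 0.0f;
        out[i].x = q0.x + t * dx;
        out[i].y = q0.y + t * dy;
    }
}
```

Calling `transform_line(&m, p0, p1, out, n)` fills `out` with `n` points while running the full transform only twice. If the points are not evenly spaced, each one would need its own parameter `t` along the segment.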

wallyk answered Sep 21 '22 07:09


It can give better than a 4x speedup over straight scalar floating point, because the SIMD instructions may be less exact (though not so much as to cause many problems) and so take fewer cycles to execute. It really depends.

The best plan is to learn as much as possible about the processor you are optimising for. You may find it can give you far better than 4x improvements; you may find it can't. We can't say without knowing more about the algorithm you are optimising and the CPU you are targeting.

Goz answered Sep 21 '22 07:09


On their own, no. But if the process of re-writing your algorithms to support them also happens to improve, say, cache locality or branching behaviour, then you could find unrelated speed-ups. However, this is true of any re-write...
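One re-write of this kind, sketched in portable C, is switching from an array-of-structures layout to a structure-of-arrays layout so that each group of 4 floats is contiguous in memory. Everything here is an assumption made for illustration: the `ParticleAoS` type, the function names, and the position update itself are hypothetical, and the 4-wide inner loop is a scalar stand-in for a real vector instruction rather than actual SPE intrinsics.

```c
#include <stddef.h>

/* Array-of-structures layout (hypothetical): the fields of one particle are
 * adjacent, so consecutive x values are 4 floats apart in memory. */
typedef struct { float x, y, vx, vy; } ParticleAoS;

/* Scalar update over the AoS layout: strided access to x and vx. */
static void advance_x_aos(ParticleAoS *p, float dt, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        p[i].x += p[i].vx * dt;
}

/* Structure-of-arrays layout: all x values are contiguous, and so are all
 * vx values. Each pass of the main loop handles one group of 4 floats,
 * which is the shape a 4-wide vector multiply-add would consume. */
static void advance_x_soa(float *x, const float *vx, float dt, size_t n)
{
    size_t i = 0;

    /* Main loop: groups of 4 contiguous elements. */
    for (; i + 4 <= n; i += 4) {
        x[i + 0] += vx[i + 0] * dt;
        x[i + 1] += vx[i + 1] * dt;
        x[i + 2] += vx[i + 2] * dt;
        x[i + 3] += vx[i + 3] * dt;
    }

    /* Tail: leftover elements when n is not a multiple of 4. */
    for (; i < n; ++i)
        x[i] += vx[i] * dt;
}
```

Both functions compute the same result; the second reads and writes unit-stride arrays, which is where any locality gain would come from, independent of the vector instructions themselves.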

Oliver Charlesworth answered Sep 20 '22 07:09