
GCC intrinsic vs inline assembly: which is better?

If I want to expose a single machine-specific instruction to the programmer, there are two ways I can do so:

  1. Define a new builtin / intrinsic
  2. Expose the same as inline assembly asm() [As it's a single arithmetic-type instruction, I believe there is no need for asm volatile()]

I have read that builtins allow the compiler to take care of type checking, register allocation, and "other optimizations". But the compiler will need to do this even in the case of asm(), right? So what precisely is the performance benefit of using an intrinsic over asm() for a single instruction?

How does the equation change if there are multiple machine instructions involved?

The "portability" argument in favor of intrinsics is understandable, but I am curious about the performance advantage, if any, of one over the other.

Cherry Vanc asked Sep 04 '14 23:09


1 Answer

I think it depends a lot on what you're doing. Modifying GCC, and requiring a modified GCC to build your program unless/until your GCC patch makes it upstream, is a lot more of a headache than just using inline asm.

If the instruction you want to use has an abstract meaning not tied to a particular instruction set architecture, adding the builtin/intrinsic is probably the "right" choice: the same code using it could automatically work on all targets, with a fallback to a more complex multi-instruction implementation on targets that don't have the instruction. It still might not be practical, though.
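As a rough illustration of what an "abstract" operation looks like once a builtin exists, here is a sketch using GCC's existing `__builtin_popcount`. The wrapper name `count_bits` is made up for this example, and the non-GCC branch is just one possible portable fallback. With the builtin, GCC decides whether to emit a single instruction or a longer sequence for the target it is compiling for.

```c
/* count_bits is a hypothetical wrapper name, used only for illustration. */
static inline unsigned count_bits(unsigned x)
{
#if defined(__GNUC__)
    /* GCC picks a native instruction where the target has one,
     * and a multi-instruction fallback where it does not. */
    return (unsigned)__builtin_popcount(x);
#else
    /* Portable fallback: clear the lowest set bit until none remain. */
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
#endif
}
```

The calling code stays the same on every target; only the generated instructions differ.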

If the instruction is something very ISA-specific, obscure, not-performance-critical, etc. (I'm thinking of loading a special hardware register, CPU mode register, getting model info, etc. but I'm sure you can think of other examples) then just using inline asm is almost certainly the right solution.
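For the "getting model info" kind of case, a minimal sketch on x86 might look like the following. It assumes GCC-style extended asm; the helper name `cpu_query` and the "return 0 on other targets" convention are inventions for this example, not an existing API.

```c
#include <stdint.h>

/* Hypothetical helper: runs CPUID for `leaf` (subleaf 0) and stores
 * EAX, EBX, ECX, EDX in out[0..3]. Returns 1 on x86 builds, 0 elsewhere. */
static inline int cpu_query(uint32_t leaf, uint32_t out[4])
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* volatile: keep the compiler from merging or dropping the query. */
    __asm__ volatile("cpuid"
                     : "=a"(out[0]), "=b"(out[1]), "=c"(out[2]), "=d"(out[3])
                     : "a"(leaf), "c"(0u));
    return 1;
#else
    /* No such instruction here; report "unsupported" to the caller. */
    (void)leaf;
    out[0] = out[1] = out[2] = out[3] = 0;
    return 0;
#endif
}
```

Nothing here would benefit from compiler knowledge of the instruction's semantics, which is why inline asm is enough for this sort of thing.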

Even if you do think a builtin is the "right" solution for your problem, but need to take the inline asm approach for practical reasons, you can still abstract it with a macro or static inline function in such a way that it's easy to replace all uses with an intrinsic later (or with a fallback implementation on targets without the instruction).
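Something along these lines is what I mean by abstracting it. This is only a sketch: `bswap32` and the `HAVE_BSWAP_INTRINSIC` switch are names invented for the example, with x86 `bswap` standing in for "your instruction". Callers only ever see the static inline function, so swapping the asm for an intrinsic later is a one-place change.

```c
#include <stdint.h>

/* bswap32 and HAVE_BSWAP_INTRINSIC are hypothetical names for this sketch. */
static inline uint32_t bswap32(uint32_t x)
{
#if defined(__GNUC__) && defined(HAVE_BSWAP_INTRINSIC)
    /* Later: replace the asm with the intrinsic in this one place. */
    return __builtin_bswap32(x);
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Now: a single instruction via inline asm. No volatile needed,
     * since the result depends only on the input operand. */
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    /* Fallback for targets without the instruction. */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}
```

A call site such as `uint32_t be = bswap32(host_value);` compiles unchanged whichever branch is selected.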

R.. GitHub STOP HELPING ICE answered Oct 20 '22 08:10