If you're writing an application which is very latency-sensitive, what are the limits to embedding assembler within C++ functions (and calling those C++ functions normally), like so:
inline __int64 GetCpuClocks()
{
    // Counter
    struct { int32 low, high; } counter;
    // Use RDTSC instruction to get clocks count
    __asm push EAX
    __asm push EDX
    __asm __emit 0fh __asm __emit 031h // RDTSC
    __asm mov counter.low, EAX
    __asm mov counter.high, EDX
    __asm pop EDX
    __asm pop EAX
    // Return result
    return *(__int64 *)(&counter);
}
(The above function came from another SO post I saw)
Can you treat assembler-inlined functions like a black box? Could you easily retrieve a result from calculations performed in assembler? Is there a danger that you don't know which variables are currently in registers, etc.? Does it cause more problems than it solves, or is it acceptable for specific small tasks?
(Assume your architecture is going to be fixed, and known)
EDIT I just found this, this is what I am hinting at:
http://www.codeproject.com/Articles/15971/Using-Inline-Assembly-in-C-C
EDIT2 This is more aimed towards Linux and x86; it's just a general C++/assembler question (or so I thought).
I'd like to answer the subquestion:
Does it cause more problems than solve, or is it acceptable for specific small tasks?
It certainly causes more problems than it solves! By using inline assembler, you take away the compiler's ability to optimize the code. It cannot do partial expression substitution or any other fancy optimization. It is really, really hard to produce code which is better than what the compiler emits with -O3. And as a bonus, the compiler's code gets even better with the next compiler release (presuming that the next compiler release doesn't break it ;) ).
Compilers usually grasp a wider scope than human brains ever could (or should, to ensure sanity): they can inline the right function at the right place, or do a partial expression substitution which makes the code more efficient. These are things you would never do in ASM, because your code would become unreadable as hell.
As an anecdotal reference, I'd like to point to this post by Linus Torvalds, relating to the git implementation of SHA1, which outperforms the hand-optimized SHA1 in libcrypt.
In fact, I think the only reasonable use of inline assembler nowadays is calling processor instructions which are not available otherwise (the one you quoted is available, on Linux for example as clock_gettime, at least if you're only after a high-resolution time counter), or doing things where you need to trick the compiler (for example when implementing foreign function interfaces).
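As a rough sketch of the clock_gettime route, assuming a POSIX system such as Linux (the helper name monotonic_ns is made up for illustration and is not from the question):

```cpp
#include <cstdint>
#include <time.h>   // POSIX: clock_gettime, CLOCK_MONOTONIC

// Hypothetical helper: monotonic time in nanoseconds, no inline asm involved.
inline std::int64_t monotonic_ns()
{
    timespec ts{};
    // CLOCK_MONOTONIC does not jump backwards when the wall clock is adjusted.
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

Taking the difference of two monotonic_ns() calls around the code of interest gives an elapsed time in nanoseconds rather than a raw cycle count, which is often what you actually want.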
On the snippet and what others said: with small functions like this one in particular, you'll get a performance penalty. In inline asm, you have to be super-careful that the registers are left in the state the compiler assumes them to be in (hence the push/pop in the snippet). If you write the code normally, the compiler takes care of this itself: it keeps in registers exactly those variables for which that makes sense, and puts those which do not fit on the stack.
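For comparison, here is a minimal sketch of the same RDTSC read in GCC/Clang extended asm, assuming an x86 target (the helper name read_tsc and the chrono fallback are my own additions, not part of the question). Instead of saving and restoring registers by hand, the output constraints tell the compiler which registers the instruction writes, so it can plan around them:

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: read the CPU time-stamp counter.
inline std::uint64_t read_tsc()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    std::uint32_t lo, hi;
    // "=a" and "=d" declare that RDTSC writes EAX and EDX. The compiler
    // then avoids or saves those registers itself; no push/pop is needed.
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
#else
    // Assumed fallback for other compilers/architectures: steady clock ticks.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Combining the two halves with a shift also avoids the pointer cast used in the original snippet.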
Trust your compiler. It's smart. Most of the time. Invest the time you save by not using inline assembler in thinking about smart, fast algorithms and learning the relevant compiler switches (e.g. to enable SSE optimizations etc.).
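To illustrate the point about compiler switches, here is a small sketch (the function saxpy is a made-up example). It is a plain loop with no asm and no intrinsics; with GCC, building with something like -O3 -march=native typically lets the compiler emit SSE/AVX code for it, and -fopt-info-vec should report whether the loop was vectorized:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: y[i] += a * x[i], written as an ordinary loop.
inline void saxpy(float a, const std::vector<float>& x, std::vector<float>& y)
{
    // Only process the elements both vectors have.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    const float* xp = x.data();
    float* yp = y.data();
    for (std::size_t i = 0; i < n; ++i)
        yp[i] += a * xp[i];   // simple, dependency-free body: easy to vectorize
}
```

Whether and how the loop is vectorized depends on the compiler version and target flags, so it is worth checking the generated assembly before drawing conclusions.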
If the asm in question pushes any registers it uses at the top and then pops them at the bottom, I think you're safe not to worry about it.
In your example, these are the __asm push EAX and __asm pop EAX instructions.
The real answer, I suppose, is that you need to know enough about what the asm does to be sure you can treat it as a black box. :)