
Approximate Number of CPU Cycles for Various Operations

I am trying to find a reference for approximately how many CPU cycles various operations require.

I don't need exact numbers (as this is going to vary between CPUs) but I'd like something relatively credible that gives ballpark figures that I could cite in discussion with friends.

As an example, we all know that floating-point division takes more CPU cycles than, say, a bit shift.

I'd guess that the division takes around 100 cycles, whereas a shift takes 1, but I'm looking for something to cite to back that up.

Can anyone recommend such a resource?

colordot asked Apr 23 '10 22:04


2 Answers

I wrote a small app to test this, using SynthMaker free edition. It is very approximate: "e" stands for empty, and the numbers are very rough cycle counts.

  divide|e:115|10
    mult|e: 48|10
     add|e: 48|10
    subs|e: 50|10
compare>|e: 50|10
     sin|e:135|10

The readings in the cycle analyser vary wildly, from 50 to 100, and are usually either the expected amount or double it, so the figures above are averages. The cycle analyser is a very rough tool, but it gives fair results. For comparison, a user-made workaround exponent coded in ASM, which calculates both the exponent and the base at audio rate, comes in at around 800 cycles, so I'd say the figures above are within roughly 50 percent of the real values.

I thought the divide would cost far more! It seems to be only about twice as much as the other operations.

If you want the file I made, to run in the free version of SM, mail me at ant.stewart at the place yahoo dotty com. I was going to save an exe, which is why I built it, but you can't save in the free version, silly me! I am not going to code it again from square one in version 1.17 :/

ant grobbelar answered Sep 24 '22 20:09


For x86 processors, see Intel® 64 and IA-32 Architectures Optimization Reference Manual, probably Appendix C.

However, it is far from easy to work out how many cycles an instruction takes to execute on a modern x86 processor. It depends heavily on, for example, whether the data is in cache, whether the access is aligned, whether branch prediction fails, whether the instruction pipeline stalls, and quite a lot of other things.

nos answered Sep 26 '22 20:09