Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How many CPU cycles one addition take?

I want to measure the number of clock cycles it takes to do an addition operation in Python 3.

I wrote a program to calculate the average value of the addition operation:

from timeit import timeit

def test(n):
    for i in range(n):
      1 + 1

if __name__ == '__main__':

    times = {}
    for i in [2 ** n for n in range(10)]:
      t = timeit.timeit("test(%d)" % i, setup="from __main__ import test", number=100000)
      times[i] = t
      print("%d additions takes %f" % (i, t))

    keys = sorted(list(times.keys()))

    for i in range(len(keys) - 2):
      print("1 addition takes %f" % ((times[keys[i+1]] - times[keys[i]]) / (keys[i+1] - keys[i])))

Output:

16 additions takes 0.288647
32 additions takes 0.422229
64 additions takes 0.712617
128 additions takes 1.275438
256 additions takes 2.415222
512 additions takes 5.050155
1024 additions takes 10.381530
2048 additions takes 21.185604
4096 additions takes 43.122559
8192 additions takes 88.323853
16384 additions takes 194.353927
1  addition takes 0.008292
1 addition takes 0.010068
1 addition takes 0.008654
1 addition takes 0.010318
1 addition takes 0.008349
1 addition takes 0.009075
1 addition takes 0.008794
1 addition takes 0.008905
1 addition takes 0.010293
1 addition takes 0.010413
1 addition takes 0.010551
1 addition takes 0.010711
1 addition takes 0.011035

So according to this output one addition takes approximately 0.0095 usecs. Following this page instructions I calculated that one addition takes 25 CPU cycles. Is this a normal value and why? Because assembly instruction ADD only takes 1-2 CPU cycles.

like image 697
Arsen Avatar asked Mar 31 '16 15:03

Arsen


1 Answers

You are timing a function call (test()), a for loop, and a call to range(). The addition is not being timed at all.

def test(n):
    for i in range(n):
        1 + 1

import dis
dis.dis(test)

Here is the byte code for your test function (does not include the call to test()):

  2           0 SETUP_LOOP              24 (to 27)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_FAST                0 (n)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                10 (to 26)
             16 STORE_FAST               1 (i)

  3          19 LOAD_CONST               2 (2)   **** 
             22 POP_TOP             
             23 JUMP_ABSOLUTE           13
        >>   26 POP_BLOCK           
        >>   27 LOAD_CONST               0 (None)
             30 RETURN_VALUE        

**** Notice, the addition is done at compilation time. Quite a few other languages and their compilers will do that, including C. However, the standards rarely define when a 1 + 1 is actually done, so it is often implementation dependant.

EDIT:

Your timeit function call could be this:

    t = timeit("x += 1", setup="x = 1", number=100000)

We can create a dummy function to check the operation:

def myfunc(x):
    x += 1

import dis
dis.dis(myfunc)

Making that change gives:

1 additions takes 0.008976
2 additions takes 0.007419
4 additions takes 0.007282
8 additions takes 0.007693
16 additions takes 0.007026
32 additions takes 0.007793
64 additions takes 0.010168
128 additions takes 0.008124
256 additions takes 0.009064
512 additions takes 0.007256
1 addition takes -0.001557
1 addition takes -0.000068
1 addition takes 0.000103
1 addition takes -0.000083
1 addition takes 0.000048
1 addition takes 0.000074
1 addition takes -0.000032
1 addition takes 0.000007

 26           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD
              7 STORE_FAST               0 (x)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE

Note that x += 1 is an INPLACE_ADD, different to x = x + 1 which is a BINARY_ADD. So you need to decide which OPCode you want to measure.

like image 121
cdarke Avatar answered Oct 21 '22 08:10

cdarke