Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Inconsistent Execution Time in Python on all systems

Something that's been driving me crazy with python... I used to think it was just Windows, but I was wrong. I can have the same exact code and run it multiple times and it executes in wildly different amounts of time. Take the following test code, for example:

import math
def fib(count):
    x = 0
    while x < count:
        a = int(((((1 + math.sqrt(5)) / 2) ** x) - (((1 - math.sqrt(5)) / 2) ** (x))) / math.sqrt(5))
        x+=1

if __name__ == '__main__':  
    import timeit

    t = timeit.Timer("fib(1250)", setup="from __main__ import fib",)
    #print t.timeit(10)
    count = 10000
    results = t.repeat(count, 1)
    min = 0xFFFF
    max = 0
    sum = 0
    for i in results:
        i = i*1000.0
        if i < min: min = i
        if i > max: max = i
        sum+=i

    print "Min {:.3f} | Max {:.3f} | Max/Min {:.3f} | Avg {:.3f}".format(min, max, max/min, sum/count)

Basically, it generates the first 1250 elements of fibonacii 10,000 times and uses timeit to get the amount of time each run takes. I then coalesce those times and find min, max, average and variance between min and max (the spread, if you will).

Here's the results:

Windows: Min 3.071 | Max 8.903 | Max/Min 2.899 | Avg 3.228
Mac OS:  Min 1.531 | Max 3.167 | Max/Min 2.068 | Avg 1.621
Ubuntu:  Min 1.242 | Max 10.090 | Max/Min 8.123 | Avg 1.349

So, Linux is the fastest but also has the most variance. By a LOT. But all of them can have a pretty wild swing: Only 200% for Mac, but 290% for Windows and 810% for linux!

Is it actually taking that much different time to execute? Is timeit not accurate enough? Is there something else I am missing? I'm working a lot with generating animations and I need as consistent time as possible.

like image 531
Adam Haile Avatar asked Dec 11 '22 04:12

Adam Haile


2 Answers

You are measuring very short times, and then even a little bit of something happening somewhere has a big impact.

I ran your test script on my machine (OS X, Core i7, Python 2.7) and made this plot of results:

enter image description here

You can see that most of the time the timing results are very consistent, but there are isolated incidents of the algorithm taking much more time (because there is something else happening).


I made a tiny adjustment to your timing procedure:

results=t.repeat(10, 1000)

So, now we are timing runs of 1000 function calls. The total amount of time is the same, naturally (10000 calls):

enter image description here

Now you can see that the performance is much more predictable. It may be that part of your wobbly timings are due to the timing methodology, not due really different times to carry out anything. Millisecond-level timing is difficult in a real-world OS environment. Even when your computer is "doing nothing", it is still switching tasks, doing background jobs, etc.


I understand the original point was not to calculate the Fibonacci numbers. But if it were, then choosing the right tool makes a difference:

import numpy as np

def fib(count):
    x = np.arange(count)
    a = (((1 + np.sqrt(5))/2) ** x - ((1 - np.sqrt(5)) / 2) ** x) / np.sqrt(5)
    a = a.astype('int')

This gives:

Min 0.120 | Max 0.471 | Max/Min 3.928 | Avg 0.125

Ten-fold speed improvement.


About the images in this answer, they are plotted with matplotlib. The first one is done thus:

import matplotlib.pyplot as plt

# create a figure
fig = plt.figure()
# create axes into the figure
ax = fig.add_subplot(111)
# plot the vector results with dots of size 2 (points) and semi-transparent blue color
ax.plot(results, '.', c=(0, 0, 1, .5), markersize=2)

See the documentation of matplotlib. It is easiest to get started by using IPython and pylab.

like image 71
DrV Avatar answered Jan 05 '23 07:01

DrV


I made a plot similar to @drV (give him the vote, he was first!) - the difference is that I sorted results so you could see a trend line. The hint is that even though the max is high, the average is low, so there are some outliers at the end.

enter image description here

I used pylab by just adding this to the bottom:

from pylab import *
results.sort()
plot(range(len(results)),results)
show()
like image 21
tdelaney Avatar answered Jan 05 '23 05:01

tdelaney