Something that's been driving me crazy with Python... I used to think it was just a Windows thing, but I was wrong. I can run the exact same code multiple times and it executes in wildly different amounts of time. Take the following test code, for example:
import math

def fib(count):
    x = 0
    while x < count:
        a = int(((((1 + math.sqrt(5)) / 2) ** x) - (((1 - math.sqrt(5)) / 2) ** (x))) / math.sqrt(5))
        x += 1

if __name__ == '__main__':
    import timeit
    t = timeit.Timer("fib(1250)", setup="from __main__ import fib")
    #print t.timeit(10)
    count = 10000
    results = t.repeat(count, 1)
    min = 0xFFFF
    max = 0
    sum = 0
    for i in results:
        i = i * 1000.0
        if i < min: min = i
        if i > max: max = i
        sum += i
    print "Min {:.3f} | Max {:.3f} | Max/Min {:.3f} | Avg {:.3f}".format(min, max, max/min, sum/count)
Basically, it generates the first 1250 elements of the Fibonacci sequence 10,000 times and uses timeit to get the amount of time each run takes. I then coalesce those times and find the min, the max, the average, and the max/min ratio (the spread, if you will).
Here's the results:
Windows: Min 3.071 | Max 8.903 | Max/Min 2.899 | Avg 3.228
Mac OS: Min 1.531 | Max 3.167 | Max/Min 2.068 | Avg 1.621
Ubuntu: Min 1.242 | Max 10.090 | Max/Min 8.123 | Avg 1.349
So, Linux is the fastest but also has by far the most variance. And all of them can have a pretty wild swing: only about 2x between min and max on the Mac, but roughly 2.9x on Windows and 8x on Linux!
Is it actually taking that much longer to execute on some runs? Is timeit not accurate enough? Is there something else I'm missing? I'm doing a lot of work generating animations and I need the timing to be as consistent as possible.
You are measuring very short times, and then even a little bit of something happening somewhere has a big impact.
I ran your test script on my machine (OS X, Core i7, Python 2.7) and made this plot of the results:
[plot: the 10,000 individual per-run times, mostly tightly clustered with occasional spikes]
You can see that most of the time the timing results are very consistent, but there are isolated incidents of the algorithm taking much more time (because there is something else happening).
I made a tiny adjustment to your timing procedure:
results = t.repeat(10, 1000)
So, now we are timing runs of 1000 function calls. The total amount of time is the same, naturally (10000 calls):
Now you can see that the performance is much more predictable. It may be that part of your wobbly timings is due to the timing methodology, not to genuinely different execution times. Millisecond-level timing is difficult in a real-world OS environment. Even when your computer is "doing nothing", it is still switching tasks, running background jobs, etc.
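If it helps, here is a minimal sketch of how that adjusted measurement could look in your script; the batch size of 1000 and the conversion back to per-call milliseconds are my own choices, not part of the original code:
import timeit

# Sketch: time 10 batches of 1000 calls each (10,000 calls total),
# then report the per-call time of each batch in milliseconds.
t = timeit.Timer("fib(1250)", setup="from __main__ import fib")
batch_size = 1000                              # assumed batch size
batch_times = t.repeat(10, batch_size)         # 10 batches of 1000 calls
per_call_ms = [bt / batch_size * 1000.0 for bt in batch_times]
print "Min {:.3f} | Max {:.3f} ms per call".format(min(per_call_ms), max(per_call_ms))
Averaging over a batch smooths out the occasional OS-induced spike, which is why the per-call numbers come out so much steadier.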
I understand the original point was not to calculate the Fibonacci numbers. But if it were, then choosing the right tool makes a difference:
import numpy as np

def fib(count):
    x = np.arange(count)
    a = (((1 + np.sqrt(5)) / 2) ** x - ((1 - np.sqrt(5)) / 2) ** x) / np.sqrt(5)
    a = a.astype('int')
This gives:
Min 0.120 | Max 0.471 | Max/Min 3.928 | Avg 0.125
Ten-fold speed improvement.
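For reference, a quick way to reproduce that comparison (assuming the NumPy version above replaces fib in the same __main__ module) is to reuse the same timeit harness as the original script:
import timeit

# Time the NumPy-based fib with the same harness as the original script.
t = timeit.Timer("fib(1250)", setup="from __main__ import fib")
times_ms = [r * 1000.0 for r in t.repeat(10000, 1)]
print "Min {:.3f} | Max {:.3f} | Avg {:.3f}".format(min(times_ms), max(times_ms), sum(times_ms) / len(times_ms))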
About the images in this answer: they are plotted with matplotlib. The first one is done thus:
import matplotlib.pyplot as plt

# create a figure
fig = plt.figure()
# create axes into the figure
ax = fig.add_subplot(111)
# plot the results vector with semi-transparent blue dots of size 2 (points)
ax.plot(results, '.', c=(0, 0, 1, .5), markersize=2)
# display the figure (not needed in interactive/pylab mode)
plt.show()
See the documentation of matplotlib. It is easiest to get started by using IPython and pylab.
I made a plot similar to @drV's (give him the vote, he was first!) - the difference is that I sorted the results so you can see the trend line. The hint is that even though the max is high, the average is low, so there are only a few outliers at the end.
I used pylab by just adding this to the bottom:
from pylab import *

# sort the per-run times so the outliers collect at the end of the curve
results.sort()
plot(range(len(results)), results)
show()