I wrote a simple Sieve of Eratosthenes, which uses a list of ones and turns them into zeros if not prime, like so:
def eSieve(n): #Where m is fixed-length list of all integers up to n
'''Creates a list of primes less than or equal to n'''
m = [1]*(n+1)
for i in xrange(2,int((n)**0.5)+1):
if m[i]:
for j in xrange(i*i,n+1,i):
m[j]=0
return [i for i in xrange(2,n) if m[i]]
I tested the speed it ran with %timeit
and got:
#n: t
#10**1: 7 μs
#10**2: 26.6 μs
#10**3: 234 μs
#10**4: 2.46 ms
#10**5: 26.4 ms
#10**6: 292 ms
#10**7: 3.27 s
I assumed, if I changed [1]
and 0
to booleans, it would run faster... but it does the opposite:
#n: t
#10**1: 7.31 μs
#10**2: 29.5 μs
#10**3: 297 μs
#10**4: 2.99 ms
#10**5: 29.9 ms
#10**6: 331 ms
#10**7: 3.7 s
Why are the booleans slower?
This happens because True
and False
are looked up as globals in Python 2. The 0
and 1
literals are just constants, looked up by a quick array reference, while globals are dictionary lookups in the global namespace (falling through to the built-ins namespace):
>>> import dis
>>> def foo():
... a = True
... b = 1
...
>>> dis.dis(foo)
2 0 LOAD_GLOBAL 0 (True)
3 STORE_FAST 0 (a)
3 6 LOAD_CONST 1 (1)
9 STORE_FAST 1 (b)
12 LOAD_CONST 0 (None)
15 RETURN_VALUE
The True
value is looked up with the LOAD_GLOBAL
bytecode, while the 1
literal value is copied to the stack with LOAD_CONST
.
If you make True
and False
locals you can make them just as fast again:
def eSieve(n, True=True, False=False):
m = [True]*(n+1)
for i in xrange(2,int((n)**0.5)+1):
if m[i]:
for j in xrange(i*i,n+1,i):
m[j]=False
return [i for i in xrange(2,n) if m[i]]
Assigning True
and False
as default values to for arguments gives the function those names as locals, with the exact same values; again using a simplified version:
>>> def bar(True=True, False=False):
... True == False
...
>>> dis.dis(bar)
2 0 LOAD_FAST 0 (True)
3 LOAD_FAST 1 (False)
6 COMPARE_OP 2 (==)
9 POP_TOP
10 LOAD_CONST 0 (None)
13 RETURN_VALUE
Note the LOAD_FAST
opcodes, now with indices just like the LOAD_CONST
bytecodes; locals in a CPython function are stored in an array just like bytecode constants.
With that change, using booleans wins out, albeit by a small margin; my timings:
# n integers globals locals
# 10**1 4.31 µs 4.2 µs 4.2 µs
# 10**2 17.1 µs 17.3 µs 16.5 µs
# 10**3 147 µs 158 µs 144 µs
# 10**4 1.5 ms 1.66 ms 1.48 ms
# 10**5 16.4 ms 18.2 ms 15.9 ms
# 10**6 190 ms 215 ms 189 ms
# 10**7 2.21 s 2.47 s 2.18 s
The difference isn't really that much because Python booleans are just an int
subclass.
Note that in Python 3, True
and False
have become keywords and can no longer be assigned to, making it possible to treat them just like integer literals.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With