Here is a (slightly messy) attempt at Project Euler Problem 49.
I should say outright that the deque
was not a good choice! My idea was that shrinking the set of primes to test for membership would cause the loop to accelerate. However, when I realised that I should have used a set
(and not worry about removing elements), I got a 60x speed-up.
from collections import deque
from itertools import permutations
from .sieve import sieve_of_erastothenes # my own implementation of the Sieve of Erastothenes
primes = deque(prime for prime in sieve_of_erastothenes(10000) if prime > 1000 and prime != 1487) # all four-digit primes except 1487
try:
while True:
prime = primes.popleft() # decrease the length of primes each time to speed up membership test
for inc in xrange(1,10000 + 1 - (2 * prime)): # this limit ensures we don't end up with results > 10000
inc1 = prime + inc
inc2 = prime + 2*inc
if inc1 in primes and inc2 in primes:
primestr = str(prime)
perms = set(''.join(tup) for tup in permutations(primestr)) # because permutations() returns tuples
inc1str = str(inc1)
inc2str = str(inc2)
if inc1str in perms and inc2str in perms:
print primestr + inc1str + inc2str
raise IOError # I chose IOError because it's unlikely to be raised
# by anything else in the block. Exceptions are an easy
# way to break out of nested loops.
except IOError:
pass
Anyway, before I thought to use a set
, I tried it out in Pypy. I found the results to be rather suprising:
$ time python "problem49-deque.py"
296962999629
real 1m3.429s
user 0m49.779s
sys 0m0.335s
$ time pypy-c "problem49-deque.py"
296962999629
real 5m52.736s
user 5m15.608s
sys 0m1.509s
Why is Pypy over five times slower on this code? I would guess that Pypy's version of the deque
is the culprit (because it runs faster on the set
version), but I have no idea why that is.
Deques are sequence-like data types designed as a generalization of stacks and queues. They support memory-efficient and fast append and pop operations on both ends of the data structure.
One way to implement a deque is a doubly-linked list, also known as a “head-tail linked list”. Each node in a doubly-linked list has a reference to the previous node in the list as well as the next element, which I will call left and right .
deque uses a linked list as part of its data structure. This is the kind of linked list it uses. With doubly linked lists, deque is capable of inserting or deleting elements from both ends of a queue with constant O(1) performance.
The slow part is inc1 in primes and inc2 in primes
. I'll look at why PyPy is so slow (thanks for the performance bug report, basically). Note that as you mentioned the code can be made incredibly faster (both on PyPy and on CPython) --- in this case, just by copying the primes
deque into a set just before the for
loop.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With