Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is Pypy's deque so slow?

Tags:

python

pypy

deque

Here is a (slightly messy) attempt at Project Euler Problem 49.

I should say outright that the deque was not a good choice! My idea was that shrinking the set of primes to test for membership would cause the loop to accelerate. However, when I realised that I should have used a set (and not worry about removing elements), I got a 60x speed-up.

from collections import deque
from itertools import permutations
from .sieve import sieve_of_erastothenes  # my own implementation of the Sieve of Erastothenes

primes = deque(prime for prime in sieve_of_erastothenes(10000) if prime > 1000 and prime != 1487)  # all four-digit primes except 1487
try:
    while True:
        prime = primes.popleft()  # decrease the length of primes each time to speed up membership test
        for inc in xrange(1,10000 + 1 - (2 * prime)):  # this limit ensures we don't end up with results > 10000
            inc1 = prime + inc
            inc2 = prime + 2*inc

            if inc1 in primes and inc2 in primes:
                primestr = str(prime)
                perms = set(''.join(tup) for tup in permutations(primestr))  # because permutations() returns tuples
                inc1str = str(inc1)
                inc2str = str(inc2)
                if inc1str in perms and inc2str in perms:
                    print primestr + inc1str + inc2str
                    raise IOError  # I chose IOError because it's unlikely to be raised
                                   # by anything else in the block. Exceptions are an easy
                                   # way to break out of nested loops.
except IOError:
    pass

Anyway, before I thought to use a set, I tried it out in Pypy. I found the results to be rather suprising:

$ time python "problem49-deque.py"
296962999629

real    1m3.429s
user    0m49.779s
sys 0m0.335s

$ time pypy-c "problem49-deque.py"
296962999629

real    5m52.736s
user    5m15.608s
sys 0m1.509s

Why is Pypy over five times slower on this code? I would guess that Pypy's version of the deque is the culprit (because it runs faster on the set version), but I have no idea why that is.

like image 888
Benjamin Hodgson Avatar asked Nov 16 '12 17:11

Benjamin Hodgson


People also ask

Is deque fast?

Deques are sequence-like data types designed as a generalization of stacks and queues. They support memory-efficient and fast append and pop operations on both ends of the data structure.

How are Deques implemented Python?

One way to implement a deque is a doubly-linked list, also known as a “head-tail linked list”. Each node in a doubly-linked list has a reference to the previous node in the list as well as the next element, which I will call left and right .

Is deque in python a linked list?

deque uses a linked list as part of its data structure. This is the kind of linked list it uses. With doubly linked lists, deque is capable of inserting or deleting elements from both ends of a queue with constant O(1) performance.


1 Answers

The slow part is inc1 in primes and inc2 in primes. I'll look at why PyPy is so slow (thanks for the performance bug report, basically). Note that as you mentioned the code can be made incredibly faster (both on PyPy and on CPython) --- in this case, just by copying the primes deque into a set just before the for loop.

like image 86
Armin Rigo Avatar answered Sep 28 '22 16:09

Armin Rigo