Using __slots__ under PyPy

I have this simple code for measuring how classes with __slots__ perform (taken from here):

import timeit

def test_slots():
    class Obj(object):
        __slots__ = ('i', 'l')

        def __init__(self, i):
            self.i = i
            self.l = []

    for i in xrange(1000):
        Obj(i)

print timeit.Timer('test_slots()', 'from __main__ import test_slots').timeit(10000)

If I run it with python2.7, it takes around 6 seconds. OK, that is indeed faster (and also more memory-efficient) than without __slots__.

But if I run the code under PyPy (2.2.1, 64-bit, for Mac OS X), it starts using 100% CPU and "never" returns (I waited for minutes with no result).

What is going on? Should I use __slots__ under PyPy?

Here's what happens if I pass different numbers to timeit():

timeit(10) - 0.067s
timeit(100) - 0.5s
timeit(1000) - 19.5s
timeit(10000) - ? (probably more than a Game of Thrones episode)

Thanks in advance.


Note that the same behavior is observed if I use namedtuples:

import collections
import timeit

def test_namedtuples():
    Obj = collections.namedtuple('Obj', 'i l')

    for i in xrange(1000):
        Obj(i, [])

print timeit.Timer('test_namedtuples()', 'from __main__ import test_namedtuples').timeit(10000)

asked Apr 14 '14 18:04 by alecxe


2 Answers

In each of the 10,000 or so iterations of the timeit code, the class is recreated from scratch. Creating classes is probably not a well-optimized operation in PyPy; even worse, doing so will probably discard all of the optimizations that the JIT learned about the previous incarnation of the class. PyPy tends to be slow until the JIT has warmed up, so doing things that require it to warm up repeatedly will kill your performance.

The solution here is, of course, to simply move the class definition outside of the code being benchmarked.
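
A minimal sketch of that change, assuming the same Obj class as in the question: the class is defined once at module level, so the timed function only creates instances. It uses range and passes the function to timeit directly so that it runs on both Python 2.7 and Python 3; the repeat count of 100 is an arbitrary choice for illustration, not a value from the question.

```python
import timeit


class Obj(object):
    # Defined once at module level, so the class is not rebuilt on every call.
    __slots__ = ('i', 'l')

    def __init__(self, i):
        self.i = i
        self.l = []


def test_slots():
    # Only instance creation is measured here, not class creation.
    for i in range(1000):
        Obj(i)


if __name__ == '__main__':
    # 100 repeats is an arbitrary, illustrative count.
    print(timeit.timeit(test_slots, number=100))
```

The same idea should apply to the namedtuple version: create the namedtuple type once outside the function and only instantiate it inside the loop.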

answered Oct 23 '22 03:10 by kwatford


To directly answer the question in the title: __slots__ is pointless for (but doesn't hurt) performance in PyPy.
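
One way to check this on a given interpreter is to time instance creation for two otherwise identical classes, one with __slots__ and one without. The sketch below is only an illustration: the names Plain, Slotted, make_instances, and compare are hypothetical helpers, and the instance and repeat counts are arbitrary. No timings are shown because they depend on the interpreter and machine.

```python
import timeit


class Plain(object):
    # No __slots__: instances carry a per-instance __dict__.
    def __init__(self, i):
        self.i = i
        self.l = []


class Slotted(object):
    # Same attributes, declared through __slots__.
    __slots__ = ('i', 'l')

    def __init__(self, i):
        self.i = i
        self.l = []


def make_instances(cls, count=1000):
    # Create and discard `count` instances of the given class.
    for i in range(count):
        cls(i)


def compare(number=100):
    # Return the total time in seconds for each class, keyed by class name.
    return {
        cls.__name__: timeit.timeit(lambda: make_instances(cls), number=number)
        for cls in (Plain, Slotted)
    }


if __name__ == '__main__':
    for name, seconds in sorted(compare().items()):
        print('%s: %.4f s' % (name, seconds))
```

Both classes are defined at module level, so the measurement covers instance creation only and does not include class creation.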

answered Oct 23 '22 02:10 by Armin Rigo