 

How to use Python 3.4's enums without significant slowdown?

I was writing a tic-tac-toe game and using an Enum to represent the three outcomes -- lose, draw, and win. I thought it would be better style than using the strings ("lose", "win", "draw") to indicate these values. But using enums gave me a significant performance hit.

Here's a minimal example, where I simply reference either Result.lose or the string literal "lose".

import enum
import timeit
class Result(enum.Enum):
    lose = -1
    draw = 0
    win = 1

>>> timeit.timeit('Result.lose', 'from __main__ import Result')
1.705788521998329
>>> timeit.timeit('"lose"', 'from __main__ import Result')
0.024598151998361573

This is much slower than simply referencing a global variable.

k = 12

>>> timeit.timeit('k', 'from __main__ import k')
0.02403248500195332

My questions are:

  • I know that global lookups are much slower than local lookups in Python. But why are enum lookups even worse?
  • How can enums be used effectively without sacrificing performance? Enum lookup turned out to be completely dominating the runtime of my tic-tac-toe program. We could save local copies of the enum in every function, or wrap everything in a class, but both of those seem awkward.
Eli Rose asked Jun 12 '15 22:06



1 Answer

You are timing the timing loop. A string literal on its own is ignored entirely:

>>> import dis
>>> def f(): "lose"
... 
>>> dis.dis(f)
  1           0 LOAD_CONST               1 (None)
              3 RETURN_VALUE        

That's a function that does nothing at all. So your second measurement, 0.024598151998361573 seconds for 1 million iterations, is just the cost of the timing loop itself.

In this case, the string actually became the docstring of the f function:

>>> f.__doc__
'lose'

but CPython generally will omit string literals in code if not assigned or otherwise part of an expression:

>>> def f():
...     1 + 1
...     "win"
... 
>>> dis.dis(f)
  2           0 LOAD_CONST               2 (2)
              3 POP_TOP             

  3           4 LOAD_CONST               0 (None)
              7 RETURN_VALUE        

Here the 1 + 1 is folded into a constant (2), and the string literal is once again gone.
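The same check can be scripted instead of read off a disassembly by eye. The following is only a sketch: the function names and the loaded_constants helper are made up for illustration, and the exact bytecode differs between CPython versions, so it only asks whether the string is ever loaded.

```python
import dis


def bare_literal():
    1 + 1
    "win"  # bare string literal: an expression statement whose value is never used


def assigned_literal():
    outcome = "win"  # here the literal has to be loaded so it can be stored
    return outcome


def loaded_constants(func):
    """Return the constants that func's bytecode loads via LOAD_CONST (hypothetical helper)."""
    return [ins.argval for ins in dis.get_instructions(func)
            if ins.opname == "LOAD_CONST"]


bare_loads = loaded_constants(bare_literal)
assigned_loads = loaded_constants(assigned_literal)

# Report whether each function ever loads the string "win".
print("bare literal loads 'win':", "win" in bare_loads)
print("assigned literal loads 'win':", "win" in assigned_loads)
```

If the bare literal never shows up among the loaded constants, timing it measures nothing but the loop around it.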

As such, you cannot compare this to looking up an attribute on an enum object. Yes, looking up an attribute takes cycles. But so does looking up another variable. If you really are worried about performance, you can always cache the attribute lookup:

>>> import timeit
>>> import enum
>>> class Result(enum.Enum):
...     lose = -1
...     draw = 0
...     win = 1
... 
>>> timeit.timeit('outcome = Result.lose', 'from __main__ import Result')
1.2259576459764503
>>> timeit.timeit('outcome = lose', 'from __main__ import Result; lose = Result.lose')
0.024848614004440606

In timeit tests all variables are locals, so both Result and lose are local lookups.
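For a like-for-like comparison, each timed statement should do the same amount of work, for example an assignment. The sketch below is one way to set that up, not a definitive benchmark. It assumes Python 3.5 or later, because it uses the globals argument of timeit.timeit, which the 3.4 interpreter from the question does not have; N and the timings dictionary are names chosen here for illustration. Absolute numbers depend on the machine and the Python version.

```python
import enum
import timeit


class Result(enum.Enum):
    lose = -1
    draw = 0
    win = 1


N = 100000  # iterations per measurement (arbitrary choice)

timings = {
    # Baseline: assigning a string literal, so the literal is really loaded.
    "string literal": timeit.timeit('outcome = "lose"', number=N),
    # Attribute lookup on the enum class in every iteration.
    # Result is a global of the timed code here, passed in via globals=.
    "enum attribute": timeit.timeit(
        "outcome = Result.lose", globals={"Result": Result}, number=N),
    # Member looked up once in setup; inside the loop, lose is a local name.
    "cached member": timeit.timeit(
        "outcome = lose", setup="lose = Result.lose",
        globals={"Result": Result}, number=N),
}

for label, seconds in timings.items():
    print("{:15s} {:.6f} s for {} iterations".format(label, seconds, N))
```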

Enum attribute lookups do take a little more time than 'regular' attribute lookups:

>>> class Foo: bar = 'baz'
... 
>>> timeit.timeit('outcome = Foo.bar', 'from __main__ import Foo')
0.04182224802207202

That's because the enum metaclass in Python 3.4 includes a specialised __getattr__ hook that is called each time you look up a member; members of an enum class are stored in a specialised dictionary rather than the class __dict__, so the normal lookup fails first and the hook then runs. Both executing that hook method and the additional attribute lookup (to access the map) take additional time:

>>> timeit.timeit('outcome = Result._member_map_["lose"]', 'from __main__ import Result')
0.25198313599685207
>>> timeit.timeit('outcome = map["lose"]', 'from __main__ import Result; map = Result._member_map_')
0.14024519600206986
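If the lookup cost ever does matter in a hot loop, the caching idea shown earlier carries over to ordinary code: bind the members to plain names once and use those names afterwards. This is a minimal sketch under that assumption; the module-level aliases and the count_wins function are hypothetical and not part of the original program.

```python
import enum


class Result(enum.Enum):
    lose = -1
    draw = 0
    win = 1


# Bind the members once at import time; later uses are plain name lookups.
LOSE, DRAW, WIN = Result.lose, Result.draw, Result.win


def count_wins(outcomes):
    """Count how many entries in outcomes are Result.win (hypothetical hot loop)."""
    win = WIN  # local alias, so the loop body only does a local lookup
    total = 0
    for outcome in outcomes:
        if outcome is win:  # enum members are singletons, so identity comparison works
            total += 1
    return total


print(count_wins([WIN, LOSE, WIN, DRAW]))
```

The aliases are still real enum members, so the readability and type-safety benefits of the enum are kept; only the repeated class attribute lookup is avoided.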

In a game of Tic-Tac-Toe you don't generally worry about what comes down to insignificant timing differences, not when the human player is orders of magnitude slower than your computer. That human player is not going to notice the difference between 1.2 microseconds and 0.024 microseconds per lookup.

Martijn Pieters answered Sep 22 '22 00:09
