Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to write __getitem__ cleanly?

In Python, when implementing a sequence type, I often (relatively speaking) find myself writing code like this:

class FooSequence(collections.abc.Sequence):
    # Snip other methods

    def __getitem__(self, key):
        if isinstance(key, int):
            # Get a single item
        elif isinstance(key, slice):
            # Get a whole slice
        else:
            raise TypeError('Index must be int, not {}'.format(type(key).__name__))

The code checks the type of its argument explicitly with isinstance(). This is regarded as an antipattern within the Python community. How do I avoid it?

  • I cannot use functools.singledispatch, because that's quite deliberately incompatible with methods (it will attempt to dispatch on self, which is entirely useless since we're already dispatching on self via OOP polymorphism). It works with @staticmethod, but what if I need to get stuff out of self?
  • Casting to int() and then catching the TypeError, checking for a slice, and possibly re-raising is still ugly, though perhaps slightly less so.
  • It might be cleaner to convert integers into one-element slices and handle both situations with the same code, but that has its own problems (return 0 or [0]?).
like image 895
Kevin Avatar asked Jan 12 '15 03:01

Kevin


People also ask

What is __ Getitem __ in Python?

__getitem__() is a magic method in Python, which when used in a class, allows its instances to use the [] (indexer) operators. Say x is an instance of this class, then x[i] is roughly equivalent to type(x). __getitem__(x, i) .

What is __ index __ method in Python?

Python's __index__(self) method is called on an object to get its associated integer value. The returned integer is used in slicing or as the basis for the conversion in the built-in functions bin() , hex() , and oct() .


2 Answers

As much as it seems odd, I suspect that the way you have it is the best way to go about things. Patterns generally exist to encompass common use cases, but that doesn't mean that they should be taken as gospel when following them makes life more difficult. The main reason that PEP 443 gives for balking at explicit typechecking is that it is "brittle and closed to extension". However, that mainly applies to custom functions that take a number of different types at any time. From the Python docs on __getitem__:

For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.

The Python documentation explicitly states the two types that should be accepted, and what to do if an item that is not of those two types is provided. Given that the types are provided by the documentation itself, it's unlikely to change (doing so would break far more implementations than just yours), so it's likely not worth the trouble to go out of your way to code against Python itself potentially changing.

If you're set on avoiding explicit typechecking, I would point you toward this SO answer. It contains a concise implementation of a @methdispatch decorator (not my name, but i'll roll with it) that lets @singledispatch work with methods by forcing it to check args[1] (arg) rather than args[0] (self). Using that should allow you to use custom single dispatch with your __getitem__ method.

Whether or not you consider either of these "pythonic" is up to you, but remember that while The Zen of Python notes that "Special cases aren't special enough to break the rules", it then immediately notes that "practicality beats purity". In this case, just checking for the two types that the documentation explicitly states are the only things __getitem__ should support seems like the practical way to me.

like image 84
Seven Deadly Sins Avatar answered Oct 12 '22 19:10

Seven Deadly Sins


The antipattern is for code to do explicit type checking, which means using the type() function. Why? Because then a subclass of the target type will no longer work. For instance, __getitem__ can use an int, but using type() to check for it means an int-subclass, which would work, will fail only because type() does not return int.

When a type-check is necessary, isinstance is the appropriate way to do it as it does not exclude subclasses.

When writing __dunder__ methods, type checking is necessary and expected -- using isinstance().

In other words, your code is perfectly Pythonic, and its only problem is the error message (it doesn't mention slices).

like image 21
Ethan Furman Avatar answered Oct 12 '22 18:10

Ethan Furman