
Correct way to detect sequence parameter?

I want to write a function that accepts a parameter which can be either a sequence or a single value. The type of value is str, int, etc., but I don't want it to be restricted to a hardcoded list. In other words, I want to know if the parameter X is a sequence or something I have to convert to a sequence to avoid special-casing later. I could do

type(X) in (list, tuple)

but there may be other sequence types I'm not aware of, and no common base class.

-N.

Edit: See my "answer" below for why most of these answers don't help me. Maybe you have something better to suggest.

asked Nov 20 '08 13:11 by noamtm


2 Answers

As of Python 2.6, use abstract base classes. (In Python 3.3+ the ABCs live in collections.abc.)

>>> import collections
>>> isinstance([], collections.Sequence)
True
>>> isinstance(0, collections.Sequence)
False
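
For readers on Python 3, here is a minimal sketch of the same check, assuming collections.abc is available (Python 3.3+). The helper name is_sequence is hypothetical and not part of the answer. It also shows that str passes the test, which is what motivates the customization described next.

```python
from collections.abc import Sequence


def is_sequence(obj):
    """Return True if obj is an instance of the Sequence ABC (hypothetical helper)."""
    return isinstance(obj, Sequence)


print(is_sequence([]))       # list is a Sequence
print(is_sequence((1, 2)))   # so is tuple
print(is_sequence(0))        # int is not
print(is_sequence("abc"))    # str is a Sequence as well
```

Note that sets, dicts and generators are iterable but are not Sequences, so this check rejects them.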

Furthermore, ABCs can be customized to account for exceptions, such as not considering strings to be sequences. Here is an example (Python 2 syntax):

import abc
import collections

class Atomic(object):
    __metaclass__ = abc.ABCMeta
    @classmethod
    def __subclasshook__(cls, other):
        return not issubclass(other, collections.Sequence) or NotImplemented

Atomic.register(basestring)

After registration the Atomic class can be used with isinstance and issubclass:

assert isinstance("hello", Atomic) == True

This is still much better than a hard-coded list, because you only need to register the exceptions to the rule, and external users of the code can register their own.
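
As a rough illustration of that last point, the sketch below shows a caller registering its own exception. It is written for Python 3 (so it uses collections.abc and the metaclass keyword), and the Version class is a made-up example type, not something from the answer.

```python
import abc
from collections.abc import Sequence


class Atomic(metaclass=abc.ABCMeta):
    @classmethod
    def __subclasshook__(cls, other):
        # Non-sequences are atomic; sequences fall through to the registry.
        return not issubclass(other, Sequence) or NotImplemented


Atomic.register(str)


class Version(tuple):
    """Hypothetical user type: a tuple subclass that should be treated as one value."""


# An external user opts their own sequence type into "atomic" treatment.
Atomic.register(Version)

assert isinstance(Version((1, 2, 3)), Atomic)
assert not isinstance((1, 2, 3), Atomic)
```

A plain tuple is still treated as a sequence; only the registered subclass is atomic.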

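Note that in Python 3 the syntax for specifying metaclasses changed, the basestring abstract superclass was removed, and the ABCs moved to collections.abc (the collections.Sequence alias was removed in 3.10), which requires something like the following instead:

import abc
import collections.abc

class Atomic(metaclass=abc.ABCMeta):
    @classmethod
    def __subclasshook__(cls, other):
        # Anything that is not a Sequence is atomic; otherwise defer to the registry.
        return not issubclass(other, collections.abc.Sequence) or NotImplemented

Atomic.register(str)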
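
With an Atomic class like this in place, the normalization the question asks for might be sketched as follows. This is one possible reading rather than part of the answer: as_sequence is a hypothetical name, and the block repeats the Python 3 class definition only so that it runs on its own.

```python
import abc
from collections.abc import Sequence


class Atomic(metaclass=abc.ABCMeta):
    @classmethod
    def __subclasshook__(cls, other):
        return not issubclass(other, Sequence) or NotImplemented


Atomic.register(str)


def as_sequence(arg):
    """Wrap atomic values in a 1-tuple; return real sequences unchanged (hypothetical helper)."""
    return (arg,) if isinstance(arg, Atomic) else arg


print(as_sequence("file.txt"))
print(as_sequence(["a.txt", "b.txt"]))
print(as_sequence(42))
```

Because the hook treats every non-Sequence as atomic, sets and generators are also wrapped rather than passed through, which may or may not be what a given caller wants.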

If desired, it's possible to write code which is compatible with both Python 2.6+ and 3.x, but doing so requires a slightly more complicated technique which dynamically creates the needed abstract base class, thereby avoiding syntax errors due to the metaclass syntax difference. This is essentially the same as what the with_metaclass() function in Benjamin Peterson's six module does.

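import abc
try:
    from collections.abc import Sequence  # Python 3.3+
except ImportError:
    from collections import Sequence  # Python 2.6 to 3.2

class _AtomicBase(object):
    @classmethod
    def __subclasshook__(cls, other):
        return not issubclass(other, Sequence) or NotImplemented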

class Atomic(abc.ABCMeta("NewMeta", (_AtomicBase,), {})):
    pass

try:
    unicode = unicode
except NameError:  # 'unicode' is undefined, assume Python >= 3
    Atomic.register(str)  # str includes unicode in Py3, make both Atomic
    Atomic.register(bytes)  # bytes will also be considered Atomic (optional)
else:
    # basestring is the abstract superclass of both str and unicode types
    Atomic.register(basestring)  # make both types of strings Atomic

In versions before 2.6, there are type checkers in the operator module (these functions were removed in Python 3).

>>> import operator
>>> operator.isSequenceType([])
True
>>> operator.isSequenceType(0)
False

answered Sep 22 '22 14:09 by A. Coady


The problem with all of the above-mentioned ways is that str is considered a sequence (it's iterable, has __getitem__, etc.) yet it's usually treated as a single item.

For example, a function may accept an argument that can either be a filename or a list of filenames. What's the most Pythonic way for the function to tell the former from the latter?

Based on the revised question, it sounds like what you want is something more like:

def to_sequence(arg):
    '''
    Determine whether arg should be treated as a "unit" or a "sequence".
    If it's a unit, return a 1-tuple containing arg; otherwise return arg unchanged.
    '''
    def _multiple(x):
        # Python 2 only: str has no __iter__ there, so strings count as units.
        return hasattr(x, "__iter__")
    if _multiple(arg):
        return arg
    else:
        return (arg,)

>>> to_sequence("a string")
('a string',)
>>> to_sequence( (1,2,3) )
(1, 2, 3)
>>> to_sequence( xrange(5) )
xrange(5)

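This isn't guaranteed to handle all types, but it handles the cases you mention quite well, and should do the right thing for most of the built-in types. Note that it relies on Python 2 behaviour: in Python 3, str defines __iter__, so a string would be returned unchanged rather than wrapped in a tuple.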

When using it, make sure whatever receives the output of this can handle iterables.
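
In Python 3, str defines __iter__, so the hasattr test above no longer separates strings from other iterables. Here is a minimal sketch of a Python 3 variant, on the assumption that str and bytes are the only iterables that should count as units. That choice is an assumption and should be adjusted to taste.

```python
def to_sequence(arg):
    """Return arg unchanged if it is a non-string iterable, else wrap it in a 1-tuple."""
    # Assumption: text and byte strings are always treated as single values.
    if isinstance(arg, (str, bytes)):
        return (arg,)
    try:
        iter(arg)  # raises TypeError for non-iterables such as int
    except TypeError:
        return (arg,)
    return arg


print(to_sequence("a string"))
print(to_sequence((1, 2, 3)))
print(to_sequence(range(5)))
print(to_sequence(5))
```

Calling iter() rather than checking for __iter__ also accepts objects that are only iterable through __getitem__.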

answered Sep 22 '22 14:09 by Gregg Lind