I made a little generator function for character ranges:
>>> def crange(start, end): ... for i in range(ord(start), ord(end)+1): ... yield chr(i) ...
And then I can do this:
>>> print(*crange('a','e')) a b c d e
Yay! But this doesn't work:
>>> crange('a','e')[::2] Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'generator' object is not subscriptable
And this works, but is O(n), unlike range
's O(1):
>>> 'y' in crange('a','z') True
That means it takes about 0.35 seconds to search for character number 109,999 out of the maximum of 110,000. 109999 in range(110000)
is, of course, fast.
At that point, my first thought was to simply subclass range. Unfortunately:
>>> class A(range): ... pass ... Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: type 'range' is not an acceptable base type
So I guess I would have to mimic it in some way that allows me to pass characters as arguments, works like range
internally, and produces characters. Unfortunately, I'm not sure how to proceed. I tried a dir()
:
>>> print(*dir(range), sep='\n') __class__ __contains__ __delattr__ __dir__ __doc__ __eq__ __format__ __ge__ __getattribute__ __getitem__ __gt__ __hash__ __init__ __iter__ __le__ __len__ __lt__ __ne__ __new__ __reduce__ __reduce_ex__ __repr__ __reversed__ __setattr__ __sizeof__ __str__ __subclasshook__ count index start step stop
which lets me see what functions are in there, but I'm not sure what they're doing, or how range
uses them. I looked for the source for range
, but it's in C, and I don't know where to find its Python wrapper (it does have one, right?).
Where do I go from here, and should I even go there?
At that point, my first thought was to simply subclass range.
range
was a function in Python2 and a "final" class in Python3 (more info here) - in both cases not something you can sub-class. You will need to create a class crange
that extends from an object
as the base type.
class crange(object):
And this works, but is O(n), unlike range's O(1)
In Python 3, there is a __contains__
method that you will define for your object.
For objects that don’t define
__contains__()
, the membership test first tries iteration via__iter__()
, then the old sequence iteration protocol via__getitem__()
, see this section in the language reference.
This allows Python to determine if the value is in your range without actually enumerating the range.
For a simple example, if your range is 1 to 1,000,000, it is trivial to determine whether 23546 is in that range (1 < 23546 < 1000000
). Of course the actual implementation is a bit more complex and adds ability to handle step increments etc.
Regarding:
Yay! But this doesn't work:
>>> crange('a','e')[::2]
In this case you need to define __getitem__
on your object. Here's an example of some of the methods required:
class crange(object): def __init__(self, start, end, step=1): # initialize your range object self.start = start self.end = end self.step = step def __iter__(self): # enable iteration over your object # (assume step size is 1) for i in range(ord(self.start), ord(self.end)+1): yield chr(i) def __getitem__(self, i): # enable accessing items in your range by index # also enable crange('a','e')[::2] # (assuming step size of 1) if isinstance( i, slice ): # implement slicing else: return chr(ord(self.start) + i) def __contains__(self, char): # enable O(1) determination of whether a value is in your range # (assume step size is 1) return ord(self.start) <= ord(char) < ord(self.end) def __len__(self): # return length (assuming step size of 1) return ord(self.end) - ord(self.start)
To add to Martin Konecny's answer. You probably want to use an internal range for everything and convert between chr and ord.
class crange: def __init__(self, *args, **kwargs): args = [ord(arg) for arg in args] kwargs = {key: ord(val) for key, val in kwargs.items()} self.range = range(*args, **kwargs) def __iter__(self): for n in self.range: yield chr(n) def __contains__(self, c): return ord(c) in self.range def __getitem__(self, i): if isinstance(i, slice): ret = crange('\x00') ret.range = self.range[i] return ret else: return chr(self.range[i]) def __repr__(self): return "crange({}, {})".format( repr(chr(self.range.start)), repr(chr(self.range.stop))) r = crange('a', 'f') print(list(r)) print('b' in r) print('f' in r) print(r[:2])
In other words: if we can't subclass it we can use object composition.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With