Here is a quote from Greg Hewgill's (https://stackoverflow.com/users/893/greg-hewgill) answer to "Explain Python's slice notation":
Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
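To make the quoted behavior concrete, here is a small illustration with a one-element list (the list and the indices are made up for the example). Slicing out of range quietly returns an empty list, while indexing out of range raises:

```python
a = [1]

# Slices are clamped to the bounds of the sequence, so no error is raised.
print(a[:-2])    # []
print(a[5:10])   # []

# Plain indexing is strict and raises IndexError instead.
try:
    a[5]
except IndexError as exc:
    print("IndexError:", exc)
```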
So when the error is preferred, what is the Pythonic way to proceed? Is there a more Pythonic way to rewrite this example?
class ParseError(Exception):
    pass

def safe_slice(data, start, end):
    """0 <= start <= end is assumed"""
    r = data[start:end]
    # a short result means the slice ran past the end of data
    if len(r) != end - start:
        raise IndexError
    return r

def lazy_parse(data):
    """extract (name, phone) from a data buffer.
    If the buffer could not be parsed, a ParseError is raised.
    """
    try:
        name_length = ord(data[0])
        extracted_name = safe_slice(data, 1, 1 + name_length)
        phone_length = ord(data[1 + name_length])
        extracted_phone = safe_slice(data, 2 + name_length, 2 + name_length + phone_length)
    except IndexError:
        raise ParseError()
    return extracted_name, extracted_phone

if __name__ == '__main__':
    print(lazy_parse("\x04Jack\x0A0123456789"))  # OK
    print(lazy_parse("\x04Jack\x0A012345678"))  # should raise ParseError
Edit: the example was simpler to write using byte strings, but my real code uses lists.
Here's one way that is arguably more Pythonic. If you want to parse a byte string, you can use the struct module, which is provided for that exact purpose:
import struct
from collections import namedtuple

Details = namedtuple('Details', 'name phone')

def lazy_parse(data):
    """extract (name, phone) from a data buffer.
    If the buffer could not be parsed, a ParseError is raised.
    """
    try:
        # "p" is the Pascal-string format: one length byte followed by the text
        name = struct.unpack_from("%dp" % len(data), data)[0]
        phone = struct.unpack_from("%dp" % (len(data)-len(name)-1), data, len(name)+1)[0]
    except struct.error:
        raise ParseError()
    return Details(name, phone)
What I still find unpythonic about this is that it throws away the useful struct.error traceback and replaces it with a ParseError, whatever that is: the original tells you what is wrong with the string, while the latter only tells you that something is wrong.
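One way to keep the original traceback is exception chaining. The following is only a sketch under some assumptions: it targets Python 3 (where `raise ... from ...` is available) and bytes input, and the helper names `read_field` and `strict_parse` are hypothetical. Note also that, at least in CPython, unpacking with the `p` format clamps an over-long length byte to the bytes available instead of raising, so the sketch reads the length byte and the payload separately to get an error on truncated input:

```python
import struct


class ParseError(Exception):
    """Raised when the buffer cannot be parsed (name taken from the question)."""


def read_field(data, offset):
    """Read one length-prefixed field; return (field, offset of the next field).

    Hypothetical helper: struct.error is raised if the buffer is too short.
    """
    (length,) = struct.unpack_from("B", data, offset)
    (field,) = struct.unpack_from("%ds" % length, data, offset + 1)
    return field, offset + 1 + length


def strict_parse(data):
    """Extract (name, phone) from a bytes buffer, keeping the struct.error as cause."""
    try:
        name, offset = read_field(data, 0)
        phone, _ = read_field(data, offset)
    except struct.error as exc:
        # "from exc" stores the original error in __cause__, so its
        # traceback and message are still shown alongside the ParseError.
        raise ParseError("truncated buffer: %s" % exc) from exc
    return name, phone


print(strict_parse(b"\x04Jack\x0a0123456789"))  # (b'Jack', b'0123456789')
```

Calling `strict_parse(b"\x04Jack\x0a012345678")` should then raise a ParseError whose `__cause__` is the underlying struct.error.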
Using a function like safe_slice would be faster than creating an object just to perform the slice, but if speed is not a bottleneck and you are looking for a nicer interface, you could define a class with a __getitem__ to perform checks before returning the slice.
This allows you to use nice slice notation instead of having to pass both the start and stop arguments to safe_slice.
class SafeSlice(object):
    # slice rules: http://docs.python.org/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange
    def __init__(self, seq):
        self.seq = seq

    def __getitem__(self, key):
        seq = self.seq
        if isinstance(key, slice):
            start, stop, step = key.start, key.stop, key.step
            # probe both end points; an out-of-range one raises IndexError
            if start:
                seq[start]
            if stop:
                if stop < 0: stop = len(seq) + stop
                seq[stop - 1]
        return seq[key]

seq = [1]
print(seq[:-2])
# []
print(SafeSlice(seq)[:-1])
# []
print(SafeSlice(seq)[:-2])
# IndexError: list index out of range
If speed is an issue, then I suggest just testing the end points instead of doing arithmetic. Item access for Python lists is O(1). The version of safe_slice below also allows you to pass 2, 3 or 4 arguments. With just 2 arguments, the second will be interpreted as the stop value (similar to range).
def safe_slice(seq, start, stop=None, step=1):
    if stop is None:
        stop = start
        start = 0
    else:
        seq[start]
    if stop < 0: stop = len(seq) + stop
    seq[stop - 1]
    return seq[start:stop:step]
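A short usage sketch of the 2-, 3- and 4-argument forms follows. The function is repeated so the snippet runs on its own, and it assumes the stop probe runs for every call while the `else` branch covers only the `seq[start]` probe. The sample list is made up for illustration:

```python
def safe_slice(seq, start, stop=None, step=1):
    if stop is None:
        # two-argument form: the second argument is the stop value
        stop = start
        start = 0
    else:
        seq[start]      # raises IndexError if start is out of range
    if stop < 0:
        stop = len(seq) + stop
    seq[stop - 1]       # raises IndexError if stop is out of range
    return seq[start:stop:step]


items = [10, 20, 30, 40, 50]

print(safe_slice(items, 3))          # [10, 20, 30]
print(safe_slice(items, 1, 4))       # [20, 30, 40]
print(safe_slice(items, 0, 5, 2))    # [10, 30, 50]

try:
    safe_slice(items, 0, 6)          # one element too many
except IndexError as exc:
    print("IndexError:", exc)
```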