>>> import sys
>>> class Potato(object):
...     def __getslice__(self, start, stop):
...         print start, stop
...
>>> sys.maxint
9223372036854775807
>>> x = sys.maxint + 69
>>> print x
9223372036854775876
>>> Potato()[123:x]
123 9223372036854775807
Why doesn't the call to __getslice__ respect the stop I sent in, and instead silently substitutes 2^63 - 1? Does this mean that implementing __getslice__ for your own syntax will generally be unsafe with longs?
I can do whatever I need with __getitem__ anyway; I'm just wondering why __getslice__ is apparently broken.
Edit: Where is the code in CPython that truncates the slice? Is this part of the Python (language) spec or just a "feature" of CPython (the implementation)?
The Python C code that handles slicing for objects that implement the sq_slice slot cannot handle any integers over Py_ssize_t (== sys.maxsize). The sq_slice slot is the C-API equivalent of the __getslice__ special method.
For a two-element slice, Python 2 uses one of the SLICE+* opcodes; these are handled by the apply_slice() function. That function in turn uses the _PyEval_SliceIndex() function to convert the Python index objects (int, long, or anything implementing the __index__ method) to a Py_ssize_t integer. The function has the following comment:
/* Extract a slice index from a PyInt or PyLong or an object with the
nb_index slot defined, and store in *pi.
Silently reduce values larger than PY_SSIZE_T_MAX to PY_SSIZE_T_MAX,
and silently boost values less than -PY_SSIZE_T_MAX-1 to -PY_SSIZE_T_MAX-1.
Return 0 on error, 1 on success.
*/
This means that any slicing in Python 2 using the 2-value syntax is limited to indices between -sys.maxsize - 1 and sys.maxsize when a sq_slice slot is provided; anything outside that range is silently clamped.
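As a rough illustration of the clamping that comment describes, here is a minimal pure-Python sketch. The name clamp_slice_index is made up for this example; the real logic lives in C inside CPython, and this only mirrors its documented behaviour.

```python
import operator
import sys

# Bounds of Py_ssize_t as visible from Python.
SSIZE_MAX = sys.maxsize
SSIZE_MIN = -sys.maxsize - 1


def clamp_slice_index(value):
    """Illustrative stand-in for the clamping described in the C comment."""
    # operator.index() accepts ints and any object defining __index__,
    # and raises TypeError for everything else.
    number = operator.index(value)
    # Out-of-range values are silently pulled back to the nearest bound.
    return max(SSIZE_MIN, min(SSIZE_MAX, number))


print(clamp_slice_index(123))
print(clamp_slice_index(sys.maxsize + 69) == SSIZE_MAX)
print(clamp_slice_index(-sys.maxsize - 100) == SSIZE_MIN)
```

The second call is the situation from the question: a stop of sys.maxint + 69 comes back as sys.maxsize.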
Slicing using the three-value form (item[start:stop:stride]) uses the BUILD_SLICE opcode instead (followed by BINARY_SUBSCR), and this creates a slice() object without limiting to sys.maxsize.
If the object doesn't implement a sq_slice slot (so no __getslice__ is present), the apply_slice() function also falls back to using a slice() object.
As for this being an implementation detail or part of the language: the Slicings expression documentation distinguishes between simple_slicing and extended_slicing; the former only permits the short_slice form. For simple slicing the indices must be plain integers:
The lower and upper bound expressions, if present, must evaluate to plain integers; defaults are zero and the sys.maxint, respectively.
This suggests that Python 2 the language limits the indices to sys.maxint values, disallowing long integers. In Python 3 simple slicing has been excised from the language altogether.
If your code has to support slicing with values beyond sys.maxsize and you have to inherit from a type that implements __getslice__, then your options are to:
- use the three-value syntax, with None for the stride:
    Potato()[123:x:None]
- create slice() objects explicitly:
    Potato()[slice(123, x)]
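To see what a slice() object carries, here is a small sketch using a hypothetical SlicePotato class that defines only __getitem__. It assumes no __getslice__ is inherited, so every slicing form arrives as a slice object and the large stop value survives untouched.

```python
import sys


class SlicePotato(object):
    """Hypothetical variant of Potato that defines only __getitem__."""

    def __getitem__(self, key):
        # With no __getslice__ in the way, slicing hands us a slice object.
        if isinstance(key, slice):
            return (key.start, key.stop, key.step)
        return key


big = sys.maxsize + 69

# Both workarounds deliver the stop value without clamping.
print(SlicePotato()[123:big:None] == (123, big, None))
print(SlicePotato()[slice(123, big)] == (123, big, None))
```

When a base class does provide __getslice__, only the two forms shown here bypass it; the plain two-value form would still be clamped.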
slice() objects can handle long integers just fine; however, the slice.indices() method still cannot handle lengths over sys.maxsize:
>>> import sys
>>> s = slice(0, sys.maxsize + 1)
>>> s
slice(0, 9223372036854775808L, None)
>>> s.stop
9223372036854775808L
>>> s.indices(sys.maxsize + 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: cannot fit 'long' into an index-sized integer
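If you need to resolve bounds against a length that large, one option is to do the arithmetic yourself. The following is only a sketch: simple_bounds is a hypothetical helper, it handles a step of None or 1 only, and it is not a drop-in replacement for slice.indices().

```python
import sys


def simple_bounds(s, length):
    """Resolve start and stop of a step-1 slice against length.

    Hypothetical helper using plain arbitrary-precision integers.
    """
    if s.step not in (None, 1):
        raise ValueError("only a step of 1 is handled in this sketch")

    def resolve(value, default):
        if value is None:
            return default
        if value < 0:
            # Negative indices count from the end, floored at 0.
            return max(length + value, 0)
        # Non-negative indices are capped at the length.
        return min(value, length)

    return resolve(s.start, 0), resolve(s.stop, length)


s = slice(0, sys.maxsize + 1)
print(simple_bounds(s, sys.maxsize + 2) == (0, sys.maxsize + 1))
```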