
Numpy slicing with bound checks

Tags: python, numpy

Does numpy offer a way to do bounds checking when you slice an array? For example, if I do:

arr = np.ones([2,2])
sliced_arr = arr[0:5,:]

This slice succeeds and just returns all of arr, even though I asked for indices that don't exist. Is there another way to slice in numpy that raises an error if I try to slice beyond the bounds of an array?
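
To make the contrast concrete, here is a minimal sketch of the behavior in question: the out-of-range slice is silently clipped, while an out-of-range integer index on the same array raises an error. It only uses the array from the example above.

```python
import numpy as np

arr = np.ones([2, 2])

# An out-of-range slice is clipped to the array's extent, with no error
print(arr[0:5, :].shape)  # (2, 2)

# An out-of-range integer index, by contrast, raises IndexError
try:
    arr[5, :]
except IndexError as exc:
    print("IndexError:", exc)
```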

Burton2000 asked Feb 11 '19 14:02


1 Answer

This ended up a bit longer than expected, but you can write your own wrapper that checks get and set operations to make sure that slices do not go beyond the array's limits (indexing arguments that are not slices are already checked by NumPy). I think I covered all cases here (ellipsis, np.newaxis, negative steps...), although there may still be some failing corner cases.

import numpy as np

# Wrapping function
def bounds_checked_slice(arr):
    return SliceBoundsChecker(arr)

# Wrapper that checks that indexing slices are within bounds of the array
class SliceBoundsChecker:

    def __init__(self, arr):
        self._arr = np.asarray(arr)

    def __getitem__(self, args):
        # Slice bounds checking
        self._check_slice_bounds(args)
        return self._arr.__getitem__(args)

    def __setitem__(self, args, value):
        # Slice bounds checking
        self._check_slice_bounds(args)
        return self._arr.__setitem__(args, value)

    # Check slices in the arguments are within bounds
    def _check_slice_bounds(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        # Iterate through indexing arguments
        arr_dim = 0
        i_arg = 0
        for i_arg, arg in enumerate(args):
            if isinstance(arg, slice):
                self._check_slice(arg, arr_dim)
                arr_dim += 1
            elif arg is Ellipsis:
                break
            elif arg is np.newaxis:
                pass
            else:
                arr_dim += 1
        # Go backwards from end after ellipsis if necessary
        arr_dim = -1
        for arg in args[:i_arg:-1]:
            if isinstance(arg, slice):
                self._check_slice(arg, arr_dim)
                arr_dim -= 1
            elif arg is Ellipsis:
                raise IndexError("an index can only have a single ellipsis ('...')")
            elif arg is np.newaxis:
                pass
            else:
                arr_dim -= 1

    # Check a single slice against the size of the given axis
    def _check_slice(self, slc, axis):
        # The parameter is named slc to avoid shadowing the built-in slice type
        size = self._arr.shape[axis]
        start = slc.start
        stop = slc.stop
        step = slc.step if slc.step is not None else 1
        if step == 0:
            raise ValueError("slice step cannot be zero")
        bad_slice = False
        if start is not None:
            # Resolve a negative start, then require it to be a valid element index
            start = start if start >= 0 else start + size
            bad_slice |= start < 0 or start >= size
        if stop is not None:
            # Resolve a negative stop; a forward slice may stop at size (exclusive end)
            stop = stop if stop >= 0 else stop + size
            bad_slice |= (stop < 0 or stop > size) if step > 0 else (stop < 0 or stop >= size)
        if bad_slice:
            raise IndexError("slice {}:{}:{} is out of bounds for axis {} with size {}".format(
                slc.start if slc.start is not None else '',
                slc.stop if slc.stop is not None else '',
                slc.step if slc.step is not None else '',
                axis % self._arr.ndim, size))

A small demo:

import numpy as np

a = np.arange(24).reshape(4, 6)
print(bounds_checked_slice(a)[:2, 1:5])
# [[ 1  2  3  4]
#  [ 7  8  9 10]]
bounds_checked_slice(a)[:2, 4:10]
# IndexError: slice 4:10: is out of bounds for axis 1 with size 6

If you wanted, you could even make this a subclass of ndarray, so you get this behavior by default, instead of having to wrap the array every time.
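
A rough sketch of what that subclass could look like, under loud simplifying assumptions: the names BoundsCheckedArray and _check_slices are hypothetical, only reads are checked (no __setitem__), and the check handles just integers and positive-step slices, not Ellipsis, np.newaxis or fancy indexing. It is not a drop-in replacement for the wrapper above.

```python
import numpy as np


def _check_slices(shape, index):
    """Raise IndexError if a slice in `index` reaches outside `shape`.

    Simplifying assumptions for this sketch: the index holds only integers
    and slices with a positive step; Ellipsis, np.newaxis and fancy
    indexing are not handled.
    """
    if not isinstance(index, tuple):
        index = (index,)
    for axis, item in enumerate(index):
        if not isinstance(item, slice):
            continue  # integers are already bounds-checked by NumPy
        size = shape[axis]
        start, stop = item.start, item.stop
        # Resolve negative bounds relative to the end of the axis
        if start is not None:
            start = start if start >= 0 else start + size
        if stop is not None:
            stop = stop if stop >= 0 else stop + size
        start_ok = start is None or 0 <= start < size
        stop_ok = stop is None or 0 <= stop <= size
        if not (start_ok and stop_ok):
            raise IndexError(
                "slice {} is out of bounds for axis {} with size {}".format(
                    item, axis, size))


class BoundsCheckedArray(np.ndarray):
    """Hypothetical ndarray subclass that checks slices on every read."""

    def __new__(cls, input_array):
        # View the data as this subclass without copying it
        return np.asarray(input_array).view(cls)

    def __getitem__(self, index):
        _check_slices(self.shape, index)
        return super().__getitem__(index)


a = BoundsCheckedArray(np.arange(24).reshape(4, 6))
print(a[:2, 1:5].shape)  # (2, 4)
```

Slicing a BoundsCheckedArray returns another BoundsCheckedArray, so the check carries over to the results of earlier slices, whereas a[:2, 4:10] raises IndexError.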

Also, note that what counts as "out of bounds" is open to interpretation. The code above treats a slice start equal to the array size as out of bounds, meaning that you cannot take an empty slice with something like arr[len(arr):]. You could edit the checks if you want slightly different behavior.
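
As one possible variation, the sketch below shows how a single-axis check might accept that empty trailing slice. The function name slice_in_bounds and the allow_empty_at_end flag are made up for illustration, and only positive steps are handled.

```python
def slice_in_bounds(slc, size, allow_empty_at_end=False):
    """Return True if a forward slice stays within an axis of length size.

    Assumes a positive step; negative steps are not handled in this sketch.
    """
    start, stop = slc.start, slc.stop
    ok = True
    if start is not None:
        start = start if start >= 0 else start + size
        # Lenient mode accepts start == size, which selects nothing
        upper = size if allow_empty_at_end else size - 1
        ok &= 0 <= start <= upper
    if stop is not None:
        stop = stop if stop >= 0 else stop + size
        ok &= 0 <= stop <= size
    return ok


print(slice_in_bounds(slice(4, None), 4))                           # False
print(slice_in_bounds(slice(4, None), 4, allow_empty_at_end=True))  # True
```

Even in the lenient mode a start past the size, such as slice(5, None) on an axis of length 4, is still rejected.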

jdehesa answered Sep 28 '22 20:09