Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slice that represents concatenated slices

The indices of a slice slice(start, stop[, step]) can be often represented by range(start, stop, step) (or range(*slice(start, stop, step).indices(length)) when taking the underlying dimensions into account).

Let's say I have even two, multidimensional slices and the second slice could be used as a slice into the result of applying the first slice.

Example:

import numpy as np
data = np.random.rand(*(100, 100, 100))
a = data[::2, 7, :] # slice 1, a.shape = (50,100)
b = a[1, ::-1] # slice 2, b.shape = (100,)

I would like to find a general expression for calculating the single slice that does the same job. I know the dimensions of the underlying data structure.

c = data[2, 7, ::-1] # same as b
np.array_equal(b, c) # True

So, in getting from [::2, 7, :] and [1, ::-1] to [2, 7, ::-1] in this example I would need a function like:

def concatenate_slices(shape, outer_slice, inner_slice):
    ...
    return combined_slice

where outer_slice and inner_slice would both be a tuple of slices. In the example shape=(100, 100, 100) and outer_slice=(slice(None, None, 2), 7, slice(None, None, None)) and inner_slice=(1, slice(None, None, -1)).

I'm not sure how to do that efficiently.

My objects do something when __getitem__(slice) is called (no intermediate views) and I want to do that only once but still have the possibility to have slices of slices.

As an extension (optional) I would like to know what happens if I have ellipses in the slices. How can I then make the combination?

like image 658
Trilarion Avatar asked Oct 31 '22 12:10

Trilarion


2 Answers

I suspect you just have to go through the tedium of a analyzing each dimension, to build up either a new slice or an array of indices. I doubt if there's a short cut.

To illustrate take your example:

In [77]: shape=(100,100,100)
In [78]: outer_slice=(slice(None, None, 2), 7, slice(None, None, None))
In [79]: inner_slice=(1, slice(None, None, -1))

The target is (right?):

(2, 7, slice(None,None,-1))

First dimension - make an array of the whole range of indices, and slice them sequentially:

In [80]: idx=np.arange(shape[0])
In [81]: idx[outer_slice[0]][inner_slice[0]]
Out[81]: 2

Could I deduce that from [::2] and [1]? I have to reason that it starts at 0, the shape is large enough to yield a 2nd value, etc

Now for the 2nd dimension. That's a scalar, so there's no corresponding inner slice.

In [82]: outer_slice[1]
Out[82]: 7

For the third, let's do that same as with the 1st, but taking into account the offset between outer and inner lists:

In [83]: idx=np.arange(shape[2])
In [84]: idx[outer_slice[2]][inner_slice[1]]
Out[84]: 
array([99, 98, 97, 96, 95, 94, 93, 92, 91,  ....7,  6,  5,  4,  3,  2,  1,  0])

Or I could deduce that outer_slice[2] does nothing, so I can use inner_slice[1] directly.

Of course it would be just as easy and efficient to apply the two slice tuples to the actual array.

X[outer_slice][inner_slice]

As long as outer_slice produces a view, combining them into a composite slice is not much of an improvement.

With shape and the slice tuples there's enough information to build a new tuple. But it appears that the required logic will be quite involved, and require an in depth knowledge of slicing, and lots of testing.

like image 97
hpaulj Avatar answered Nov 15 '22 05:11

hpaulj


Let's start with the simple case: 1-d arrays. We'll need to keep track of the start, stop, and step values for the final slice, which we can update like so:

def update_1d(a, b, length):
  a_start, a_stop, a_step = a.indices(length)
  a_length = len(xrange(a_start, a_stop, a_step))
  if a_length == 0:
    # doesn't matter what b is if data[a] is []
    return a
  b_start, b_stop, b_step = b.indices(a_length)
  b_length = len(xrange(b_start, b_stop, b_step))
  if b_length == 0:
    # result will be empty, so we can exit early
    return slice(0, 0, 1)
  # convert b's start into a's coordinates, without wrapping around
  start = max(0, a_start + b_start * a_step)
  # steps are multiplicative, which makes things easy
  step = a_step * b_step
  # the stop index is the hard part because it depends on the sign of both steps
  x = a_start + b_stop * a_step
  if step < 0:
    # indexing backwards, so truncate if b's converted step goes below zero
    stop = x if x >= 0 else None
  elif a_step > 0:
    # both steps are positive, so take the smallest stop index
    stop = min(a_stop, x)
  else:
    # both steps are negative, so take the largest stop index
    stop = max(a_stop, x)
  return slice(start, stop, step)

Note that this expects a and b to be slices. You can usually convert other forms into slice objects, though. This even includes Ellipsis objects, assuming you know how many dimensions you have.

To extend this to the multidimensional case, we'll need to do some bookkeeping to keep track of which original dimension is getting sliced. For example, if you have data[::2, 7, :][:, 2:-2], you'll have to map the second dimension of the second slice to the third dimension of the first slice.

like image 40
perimosocordiae Avatar answered Nov 15 '22 04:11

perimosocordiae