
Combining two slicing operations

Tags: python, numpy

Is there a smart and easy way to combine two slicing operations into one?

Say I have something like

>>> from numpy import arange
>>> arange(1000)[::2][10:20]
array([20, 22, 24, 26, 28, 30, 32, 34, 36, 38])

Of course in this example this is not a problem, but if the arrays are very large I would very much like to avoid creating the intermediate array (or is there none?). I believe it should be possible to combine the two slices, but maybe I'm overlooking something. So the idea would be something like:

arange(1000)[ slice(None,None,2) + slice(10,20,None) ]

This of course does not work, but it is what I would like to do. Is there anything that combines slice objects? (Despite my efforts, I did not find anything.)

Magellan88 asked Oct 08 '13 20:10


2 Answers

  1. You can write a thin wrapper class around slice to make such superposition of slices possible (the built-in slice type itself cannot be subclassed in CPython). Just define __add__ (or __mul__; a mathematician would surely prefer * notation for superposition). But it is going to involve some math. By the way, you could make a nice Python package with this stuff ;-)
  2. As bheklilr said, slicing costs next to nothing in NumPy: basic slicing returns a view and copies no data. So you can just go on with a simple solution, such as applying a list of slices one after another.
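To make the "costs nothing" point concrete, here is a minimal check, assuming a plain 1-D integer array and basic (start:stop:step) slicing only. The variable names are illustrative. np.shares_memory reports whether two arrays are backed by the same buffer.

```python
import numpy as np

a = np.arange(1000)
step1 = a[::2]         # basic slicing: a view, no data copied
step2 = step1[10:20]   # a view of a view, still backed by a's buffer

# True if both arrays overlap in memory.
shares = np.shares_memory(a, step2)

# Writing through the chained view changes the original array,
# which would not happen if an intermediate copy had been made.
step2[0] = -1
first_changed = a[20]
```

Fancy indexing (with index arrays or boolean masks) does copy, so this only holds for ordinary slices.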

P.S. In general, multiple slicing can be used to make code nicer and much clearer. Even a simple choice among the following lines:

v = A[::2][10:20]
v = A[20:40][::2]
v = A[20:40:2]

can deeply reflect program logic, making code self-documenting.
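A quick check that the three spellings select the same elements, assuming A is a 1-D array with at least 40 entries (the array below is made up for illustration):

```python
import numpy as np

A = np.arange(100)

v1 = A[::2][10:20]   # every second element, then items 10..19 of that
v2 = A[20:40][::2]   # the window 20..39, then every second element
v3 = A[20:40:2]      # both steps in a single slice

same = np.array_equal(v1, v2) and np.array_equal(v2, v3)
```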

One more example: if you have a flat NumPy array and you wish to extract a subarray of length length starting at index position, you can do

v = A[position : position + length]

or

v = A[position:][:length]

Decide for yourself which option looks better. ;-)
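One practical difference between the two spellings (an observation added here, not part of the original answer): they agree for non-negative position, but can differ when position is negative and position + length reaches 0. A small sketch with made-up values:

```python
import numpy as np

A = np.arange(10)

# Non-negative position: both forms agree.
position, length = 3, 4
a = A[position : position + length]
b = A[position:][:length]

# Negative position: position + length == 0, so the first form
# becomes A[-3:0], which is empty; the chained form still works.
position, length = -3, 3
c = A[position : position + length]
d = A[position:][:length]
```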

Tigran Saluev answered Sep 18 '22 12:09


As @Tigran said, slicing costs nothing when using NumPy arrays. However, in general we can combine two consecutive slices using the information from slice.indices, which

Retrieve[s] the start, stop, and step indices from the slice object slice assuming a sequence of length length
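For readers who have not used it, a few illustrative calls show what slice.indices returns for a given sequence length (the lengths and variable names here are arbitrary):

```python
# slice.indices(length) resolves None and negative bounds against `length`
# and returns a concrete (start, stop, step) tuple.
every_other = slice(None, None, 2).indices(10)   # (0, 10, 2)
clipped = slice(10, 20).indices(15)              # (10, 15, 1): stop is clipped
from_end = slice(-3, None).indices(10)           # (7, 10, 1)
reverse = slice(None, None, -1).indices(10)      # (9, -1, -1)
```

Note the stop of -1 in the reversed case. It means "run past index 0" and cannot be passed back to slice() literally, since a literal -1 there would be read as "the last element".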

We can reduce

x[slice1][slice2]

to

x[combined]

The first slice returns a new object, which the second slice is then applied to. To combine the two properly we therefore also need the length of the data object along its first dimension, because omitted and negative bounds are resolved against that length.
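A small illustration of why the length matters, using plain lists and two example slices chosen for this sketch: the same pair of slices corresponds to a different single slice depending on how long the data is.

```python
slice1 = slice(None, None, -1)   # reverse
slice2 = slice(None, 3)          # first three of the reversed data

# Elements picked by the chained slices for two different lengths.
results = {n: list(range(n))[slice1][slice2] for n in (5, 10)}

# The equivalent single slices differ with the length:
single_for_5 = list(range(5))[slice(4, 1, -1)]
single_for_10 = list(range(10))[slice(9, 6, -1)]
```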

So, we can write

def slice_combine(slice1, slice2, length):
    """
    Return a single slice equivalent to applying slice1 and then slice2.
    That is,
      x[slice1][slice2]
    becomes
      combined_slice = slice_combine(slice1, slice2, len(x))
      x[combined_slice]

    :param slice1: The first slice
    :param slice2: The second slice
    :param length: The length of the first dimension of the data being sliced (e.g. len(x))
    :return: A slice such that x[combined_slice] selects the same elements as x[slice1][slice2]
    """

    # First get the step sizes of the two slices.
    slice1_step = (slice1.step if slice1.step is not None else 1)
    slice2_step = (slice2.step if slice2.step is not None else 1)

    # The final step size
    step = slice1_step * slice2_step

    # Use slice1.indices to get the actual indices returned from slicing with slice1
    slice1_indices = slice1.indices(length)

    # We calculate the length of the first slice
    slice1_length = (abs(slice1_indices[1] - slice1_indices[0]) - 1) // abs(slice1_indices[2])

    # If we step in the same direction as the start,stop, we get at least one datapoint
    if (slice1_indices[1] - slice1_indices[0]) * slice1_step > 0:
        slice1_length += 1
    else:
        # Otherwise, The slice is zero length.
        return slice(0,0,step)

    # Use the length after the first slice to get the indices returned from a
    # second slice starting at 0.
    slice2_indices = slice2.indices(slice1_length)

    # if the final range length = 0, return
    if not (slice2_indices[1] - slice2_indices[0]) * slice2_step > 0:
        return slice(0,0,step)

    # We shift slice2_indices by the starting index in slice1 and the 
    # step size of slice1
    start = slice1_indices[0] + slice2_indices[0] * slice1_step
    stop = slice1_indices[0] + slice2_indices[1] * slice1_step

    # slice.indices will return -1 as the stop index when slice.stop should be set to None.
    if start > stop:
        if stop < 0:
            stop = None

    return slice(start, stop, step)

Then, let's run some tests

import sys
import numpy as np

# Make a 1D dataset
x = np.arange(100)
l = len(x)

# Make a (100, 10) dataset
x2 = np.arange(1000)
x2 = x2.reshape((100,10))
l2 = len(x2)

# Test indices and steps
indices = [None, -1000, -100, -99, -50, -10, -1, 0, 1, 10, 50, 99, 100, 1000]
steps = [-1000, -99, -50, -10, -3, -2, -1, 1, 2, 3, 10, 50, 99, 1000]
indices_l = len(indices)
steps_l = len(steps)

count = 0
total = 2 * indices_l**4 * steps_l**2
for i in range(indices_l):
    for j in range(indices_l):
        for k in range(steps_l):
            for q in range(indices_l):
                for r in range(indices_l):
                    for s in range(steps_l):
                        # Print the progress. There are a lot of combinations.
                        if count % 5197 == 0:
                            sys.stdout.write("\rPROGRESS: {0:,}/{1:,} ({2:.0f}%)".format(count, total, float(count) / float(total) * 100))
                            sys.stdout.flush()

                        slice1 = slice(indices[i], indices[j], steps[k])
                        slice2 = slice(indices[q], indices[r], steps[s])

                        combined = slice_combine(slice1, slice2, l)
                        combined2 = slice_combine(slice1, slice2, l2)
                        np.testing.assert_array_equal(x[slice1][slice2], x[combined], 
                            err_msg="For 1D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(slice1, slice2, combined, count))
                        np.testing.assert_array_equal(x2[slice1][slice2], x2[combined2], 
                            err_msg="For 2D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(slice1, slice2, combined2, count))

                        # 2 tests per loop
                        count += 2

print("\n-----------------")
print("All {0:,} tests passed!".format(count))

And thankfully we get

All 15,059,072 tests passed!
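As an independent cross-check, here is a shorter alternative sketch. It is not the method used above: it relies on the fact that Python 3 range objects can themselves be sliced and return a new range. The function name slice_combine_via_range is made up for this example.

```python
def slice_combine_via_range(slice1, slice2, length):
    """Combine two slices by letting range() do the index arithmetic."""
    # Slicing a range yields another range with concrete start/stop/step.
    r = range(length)[slice1][slice2]
    if len(r) == 0:
        # Nothing is selected; return an explicitly empty slice.
        return slice(0, 0, r.step)
    # A negative stop can only occur here with a negative step and means
    # "run past index 0". A slice would read it as counted from the end,
    # so use None instead.
    stop = r.stop if r.stop >= 0 else None
    return slice(r.start, stop, r.step)


# The example from the question: [::2] followed by [10:20] on 1000 items.
combined = slice_combine_via_range(slice(None, None, 2), slice(10, 20), 1000)
# combined is slice(20, 40, 2)
```

Like slice_combine, it needs the length of the first dimension, and it raises ValueError for a step of 0 just as ordinary slicing does.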

well answered Sep 18 '22 12:09