I have a NumPy 2D array, and I would like to select a different-sized range from each column, depending on the column index. Here is an example input array:
a = np.reshape(np.array(range(15)), (5, 3))
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]
Then the list b = [4,3,1] determines the range size for each column slice, so that we would get the arrays
[0 3 6 9]
[1 4 7]
[2]
which we can concatenate and flatten to get the final desired output
[0 3 6 9 1 4 7 2]
Currently, to perform this task, I am using the following code
slices = []
for i in range(a.shape[1]):
    slices.append(a[:b[i],i])
c = np.concatenate(slices)
and, if possible, I want to convert it to a pythonic format.
Bonus: The same question, but now considering that b determines row slices instead of columns.
Define a vectorized function which takes a nested sequence of objects or numpy arrays as inputs and returns a single numpy array or a tuple of numpy arrays. The vectorized function evaluates pyfunc over successive tuples of the input arrays like the python map function, except it uses the broadcasting rules of numpy.
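As a rough illustration of that description, here is a minimal sketch that uses `np.vectorize` to build a per-column "row index is below the limit" table for the question's `b`. The name `below_limit` is hypothetical and chosen only for this example. Note that `np.vectorize` is a convenience wrapper around a Python-level loop, so a direct broadcasted comparison is usually the faster choice.

```python
import numpy as np

b = [4, 3, 1]

# Hypothetical scalar function: is row index r inside the first n rows?
def below_limit(r, n):
    return r < n

# np.vectorize applies below_limit element by element, with broadcasting
vec_below_limit = np.vectorize(below_limit)

# (5, 1) row indices broadcast against the (3,) limits give a (5, 3) table
rows = np.arange(5)[:, None]
mask_vec = vec_below_limit(rows, b)

# The same table from a plain broadcasted comparison
mask_direct = rows < np.array(b)
```

Both `mask_vec` and `mask_direct` should hold the same boolean values; the second form avoids the per-element Python call.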
ndarrays can be indexed using the standard Python x[obj] syntax, where x is the array and obj the selection. There are different kinds of indexing available depending on obj: basic indexing, advanced indexing and field access.
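The three kinds of indexing mentioned above can be sketched as follows. The array `x` and the structured array `rec` (with its field names `id` and `val`) are made up for this illustration.

```python
import numpy as np

x = np.arange(15).reshape(5, 3)

# Basic indexing: slices and integers; the result is a view into x
basic = x[1:3, 0]

# Advanced indexing with an integer list; the result is a copy
fancy = x[[0, 2, 4], 1]

# Advanced indexing with a boolean mask; the result is 1D, in row-major order
masked = x[x % 2 == 0]

# Field access: selecting a named field of a structured array
rec = np.array([(1, 2.0), (3, 4.0)], dtype=[("id", "i4"), ("val", "f8")])
ids = rec["id"]
```

Boolean-mask indexing is the kind that matters for this question, because it can select a different number of elements from each row or column in one step.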
We can use broadcasting to generate an appropriate mask, and then masking does the job -
In [150]: a
Out[150]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])
In [151]: b
Out[151]: [4, 3, 1]
In [152]: mask = np.arange(len(a))[:,None] < b
In [153]: a.T[mask.T]
Out[153]: array([0, 3, 6, 9, 1, 4, 7, 2])
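To see why the transposes are needed, it may help to look at the intermediate mask. The sketch below rebuilds the mask step by step and compares the result with the loop from the question; the names `row_major`, `col_major` and `expected` are introduced here only for illustration.

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
b = [4, 3, 1]

# Row indices as a (5, 1) column compared against the (3,) limits in b:
# broadcasting yields a (5, 3) mask, True where row index < b[column]
mask = np.arange(len(a))[:, None] < b

# Boolean indexing reads elements in row-major order, so indexing a
# directly would interleave the columns
row_major = a[mask]

# Transposing both the array and the mask walks the data column by column
col_major = a.T[mask.T]

# Reference result from an explicit loop over the columns
expected = np.concatenate([a[:n, i] for i, n in enumerate(b)])
```

Here `row_major` contains the right elements but in the wrong order, while `col_major` matches the loop-based result.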
Another way to mask would be -
In [156]: a.T[np.greater.outer(b, np.arange(len(a)))]
Out[156]: array([0, 3, 6, 9, 1, 4, 7, 2])
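The `np.greater.outer` form should produce the transposed mask directly, which is why only `a` needs a transpose there. A small check, assuming the same `a` and `b` as above (the names `outer_mask` and `broadcast_mask` are illustrative):

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
b = [4, 3, 1]

# outer_mask[i, j] is True where b[i] > j, giving shape (len(b), len(a)) = (3, 5)
outer_mask = np.greater.outer(b, np.arange(len(a)))

# The broadcasting version builds the (5, 3) mask, which is then transposed
broadcast_mask = (np.arange(len(a))[:, None] < b).T

result = a.T[outer_mask]
```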
Bonus: Slice per row
If we are required to slice per row based on chunk sizes, we would need to modify a few things -
In [51]: a
Out[51]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
# slice lengths per row
In [52]: b
Out[52]: [4, 3, 1]
# Usual loop based solution :
In [53]: np.concatenate([a[i,:b_i] for i,b_i in enumerate(b)])
Out[53]: array([ 0, 1, 2, 3, 5, 6, 7, 10])
# Vectorized mask based solution :
In [54]: a[np.greater.outer(b, np.arange(a.shape[1]))]
Out[54]: array([ 0, 1, 2, 3, 5, 6, 7, 10])
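Both cases can be folded into one small function. The following is only a sketch: `take_leading` and its `axis` convention are hypothetical names invented here, not part of NumPy. With `axis=0` it takes the first `counts[i]` entries down column `i`; with `axis=1` it takes the first `counts[i]` entries along row `i`.

```python
import numpy as np

def take_leading(a, counts, axis=0):
    """Hypothetical helper: concatenate the leading counts[i] entries
    of each column (axis=0) or each row (axis=1) of a 2D array."""
    a = np.asarray(a)
    counts = np.asarray(counts)
    if a.ndim != 2:
        raise ValueError("a must be 2D")
    if axis == 0:
        # Per-column counts: transpose so that columns become rows
        a = a.T
    elif axis != 1:
        raise ValueError("axis must be 0 or 1")
    if counts.shape != (a.shape[0],):
        raise ValueError("counts needs one entry per selected column/row")
    # mask[i, j] is True for the first counts[i] positions of line i
    mask = counts[:, None] > np.arange(a.shape[1])
    return a[mask]

per_column = take_leading(np.arange(15).reshape(5, 3), [4, 3, 1], axis=0)
per_row = take_leading(np.arange(15).reshape(3, 5), [4, 3, 1], axis=1)
```

As with ordinary slicing, a count larger than the line length simply selects the whole line.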