
Replace subarrays in numpy

Given an array,

>>> n = 2
>>> a = numpy.array([[[1,1,1],[1,2,3],[1,3,4]]]*n)
>>> a
array([[[1, 1, 1],
        [1, 2, 3],
        [1, 3, 4]],

       [[1, 1, 1],
        [1, 2, 3],
        [1, 3, 4]]])

I know that it's possible to replace values in it succinctly like so,

>>> a[a==2] = 0
>>> a
array([[[1, 1, 1],
        [1, 0, 3],
        [1, 3, 4]],

       [[1, 1, 1],
        [1, 0, 3],
        [1, 3, 4]]])

Is it possible to do the same for an entire row (last axis) in the array? I know that a[a==[1,2,3]] = 11 will work and replace all the elements of the matching subarrays with 11, but I'd like to substitute a different subarray. My intuition tells me to write the following, but an error results,

>>> a[a==[1,2,3]] = [11,22,33]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: array is not broadcastable to correct shape

In summary, what I'd like to get is:

array([[[1, 1, 1],
        [11, 22, 33],
        [1, 3, 4]],

       [[1, 1, 1],
        [11, 22, 33],
        [1, 3, 4]]])

... and n of course is, in general, a lot larger than 2, and the other axes are also larger than 3, so I don't want to loop over them if I don't need to.


Update: The [1,2,3] (or whatever else I'm looking for) is not always at index 1. An example:

a = numpy.array([[[1,1,1],[1,2,3],[1,3,4]], [[1,2,3],[1,1,1],[1,3,4]]])
Karol asked Jun 05 '12 08:06



2 Answers

You can do this efficiently by using np.all to check whether every element along the last axis matches the row you are looking for, then using the resulting mask to replace those rows:

import numpy as np

mask = np.all(a == [1, 2, 3], axis=2)  # True where an entire row equals [1, 2, 3]
a[mask] = [11, 22, 33]

print(a)
#array([[[ 1,  1,  1],
#        [11, 22, 33],
#        [ 1,  3,  4]],
# 
#       [[ 1,  1,  1],
#        [11, 22, 33],
#        [ 1,  3,  4]]])
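
As a minimal sketch, the same mask approach can be tried on the array from the question's update, where the matching row sits at a different index in each block. Using axis=-1 in place of axis=2 is an assumption made here so the snippet does not depend on the number of dimensions; the comments describe the shapes involved.

```python
import numpy as np

# Array from the question's update: [1, 2, 3] is at index 1 in the first
# block and at index 0 in the second block.
a = np.array([[[1, 1, 1], [1, 2, 3], [1, 3, 4]],
              [[1, 2, 3], [1, 1, 1], [1, 3, 4]]])

# The comparison broadcasts [1, 2, 3] along the last axis, giving a boolean
# array of shape (2, 3, 3). Reducing with all() over the last axis leaves
# one boolean per row, shape (2, 3).
mask = np.all(a == [1, 2, 3], axis=-1)

# A (2, 3) mask indexes the first two axes, so a[mask] has shape (k, 3),
# where k is the number of matching rows. A length-3 replacement then
# broadcasts across those k rows.
a[mask] = [11, 22, 33]

print(a)
```

The original attempt, a[a==[1,2,3]] = [11,22,33], fails because the unreduced element-wise mask selects individual elements (six of them in the original example) rather than whole rows, and three values cannot be broadcast onto six.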
Saullo G. P. Castro answered Sep 22 '22 16:09


You have to do something a little more complicated to achieve what you want.

You can't select slices of arrays as such, but you can select all the specific indices you want.

So first you need to construct an array that represents the rows you wish to select. For example:

import numpy

data = numpy.array([[1,2,3],[55,56,57],[1,2,3]])

to_select = numpy.array([1,2,3]*3).reshape(3,3) # three rows of [1,2,3]

selected_indices = data == to_select
# array([[ True,  True,  True],
#        [False, False, False],
#        [ True,  True,  True]], dtype=bool)

data = numpy.where(selected_indices, [4,5,6], data)
# array([[4, 5, 6],
#        [55, 56, 57],
#        [4, 5, 6]])

# done in one step, but perhaps not very clear as to its intent
data = numpy.where(data == numpy.array([1,2,3]*3).reshape(3,3), [4,5,6], data)

numpy.where works element by element: it takes the value from the second argument where the condition is true and from the third argument where it is false.

The second and third arguments to where can each be one of three kinds of data. The first is an array that has the same shape as selected_indices. The second is just a value on its own (like 2 or 7). The third is the most complicated: an array whose shape can be broadcast to the shape of selected_indices. In this case we provided [4,5,6], which is repeated along the first axis to give an array with shape 3x3.
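
Because the condition is evaluated per element, a row that only partly matches [1,2,3] would have its matching positions replaced as well. The sketch below illustrates this with a hypothetical third row, [1, 9, 3], that is not in the original data, and shows one possible way to restrict the replacement to whole rows by reducing the condition with numpy.all and keepdims=True. It is an illustration under those assumptions, not part of the original answer.

```python
import numpy as np

# Hypothetical data: the last row matches [1, 2, 3] only in positions 0 and 2.
data = np.array([[1, 2, 3], [55, 56, 57], [1, 9, 3]])

# Element-wise condition: every position is decided independently, so the
# partially matching row is partially replaced.
elementwise = np.where(data == [1, 2, 3], [4, 5, 6], data)

# Row-wise condition: reduce over the last axis and keep it as a length-1
# axis, so the (3, 1) condition broadcasts against the (3, 3) data.
row_match = np.all(data == [1, 2, 3], axis=-1, keepdims=True)
rowwise = np.where(row_match, [4, 5, 6], data)

print(elementwise)
print(rowwise)
```

With the element-wise condition the last row becomes [4, 9, 6]; with the row-wise condition it is left as [1, 9, 3].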

Dunes answered Sep 21 '22 16:09