I would like to be able to quickly instantiate a matrix where the first few (variable number of) cells in a row are 0, and the rest are ones.
Imagine we want a 3x4 matrix.
I have instantiated the matrix first as all ones:
ones = np.ones([4,3])
Then imagine we have an array that announces how many leading zeros there are:
arr = np.array([2,1,3,0]) # first row has 2 zeroes, second row 1 zero, etc
Required result:
array([[0, 0, 1],
[0, 1, 1],
[0, 0, 0],
[1, 1, 1]])
Obviously this can be done in the opposite way as well, but I'd consider the approach where 1 is a default value, and zeros would be replaced.
What would be the best way to avoid some silly loop?
Here's one way. n
is the number of columns in the result. The number of rows is determined by len(arr)
.
In [29]: n = 5
In [30]: arr = np.array([1, 2, 3, 0, 3])
In [31]: (np.arange(n) >= arr[:, np.newaxis]).astype(int)
Out[31]:
array([[0, 1, 1, 1, 1],
[0, 0, 1, 1, 1],
[0, 0, 0, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 0, 1, 1]])
There are two parts to the explanation of how this works. First, how to create a row with m
zeros and n-m
ones? For that, we use np.arange
to create a row with values [0, 1, ..., n-1]`:
In [35]: n
Out[35]: 5
In [36]: np.arange(n)
Out[36]: array([0, 1, 2, 3, 4])
Next, compare that array to m
:
In [37]: m = 2
In [38]: np.arange(n) >= m
Out[38]: array([False, False, True, True, True], dtype=bool)
That gives an array of boolean values; the first m
values are False and the rest are True. By casting those values to integers, we get an array of 0s and 1s:
In [39]: (np.arange(n) >= m).astype(int)
Out[39]: array([0, 0, 1, 1, 1])
To perform this over an array of m
values (your arr
), we use broadcasting; this is the second key idea of the explanation.
Note what arr[:, np.newaxis]
gives:
In [40]: arr
Out[40]: array([1, 2, 3, 0, 3])
In [41]: arr[:, np.newaxis]
Out[41]:
array([[1],
[2],
[3],
[0],
[3]])
That is, arr[:, np.newaxis]
reshapes arr
into a 2-d array with shape (5, 1). (arr.reshape(-1, 1)
could have been used instead.) Now when we compare this to np.arange(n)
(a 1-d array with length n
), broadcasting kicks in:
In [42]: np.arange(n) >= arr[:, np.newaxis]
Out[42]:
array([[False, True, True, True, True],
[False, False, True, True, True],
[False, False, False, True, True],
[ True, True, True, True, True],
[False, False, False, True, True]], dtype=bool)
As @RogerFan points out in his comment, this is basically an outer product of the arguments, using the >=
operation.
A final cast to type int
gives the desired result:
In [43]: (np.arange(n) >= arr[:, np.newaxis]).astype(int)
Out[43]:
array([[0, 1, 1, 1, 1],
[0, 0, 1, 1, 1],
[0, 0, 0, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 0, 1, 1]])
Not as concise as I wanted (I was experimenting with mask_indices
), but this will also do the work:
>>> n = 3
>>> zeros = [2, 1, 3, 0]
>>> numpy.array([[0] * zeros[i] + [1]*(n - zeros[i]) for i in range(len(zeros))])
array([[0, 0, 1],
[0, 1, 1],
[0, 0, 0],
[1, 1, 1]])
>>>
Works very simple: concatenates multiplied required number of times, one-element lists [0]
and [1]
, creating the array row by row.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With