Instantiate a matrix with x zeros and the rest ones

I would like to be able to quickly instantiate a matrix where the first few (variable number of) cells in a row are 0, and the rest are ones.

Imagine we want a 4x3 matrix (4 rows, 3 columns).

I have instantiated the matrix first as all ones:

ones = np.ones([4,3])

Then imagine we have an array that announces how many leading zeros there are:

arr = np.array([2,1,3,0]) # first row has 2 zeroes, second row 1 zero, etc

Required result:

array([[0, 0, 1],
       [0, 1, 1],
       [0, 0, 0],
       [1, 1, 1]])

Obviously this can be done the opposite way as well, but I'm considering the approach where 1 is the default value and the leading cells are replaced with zeros.

What would be the best way to avoid some silly loop?

PascalVKooten asked Mar 19 '23 08:03


2 Answers

Here's one way. n is the number of columns in the result. The number of rows is determined by len(arr).

In [29]: n = 5

In [30]: arr = np.array([1, 2, 3, 0, 3])

In [31]: (np.arange(n) >= arr[:, np.newaxis]).astype(int)
Out[31]: 
array([[0, 1, 1, 1, 1],
       [0, 0, 1, 1, 1],
       [0, 0, 0, 1, 1],
       [1, 1, 1, 1, 1],
       [0, 0, 0, 1, 1]])

There are two parts to the explanation of how this works. First, how do we create a row with m zeros and n-m ones? For that, we use np.arange to create a row with values [0, 1, ..., n-1]:

In [35]: n
Out[35]: 5

In [36]: np.arange(n)
Out[36]: array([0, 1, 2, 3, 4])

Next, compare that array to m:

In [37]: m = 2

In [38]: np.arange(n) >= m
Out[38]: array([False, False,  True,  True,  True], dtype=bool)

That gives an array of boolean values; the first m values are False and the rest are True. By casting those values to integers, we get an array of 0s and 1s:

In [39]: (np.arange(n) >= m).astype(int)
Out[39]: array([0, 0, 1, 1, 1])

To perform this over an array of m values (your arr), we use broadcasting; this is the second key idea of the explanation.

Note what arr[:, np.newaxis] gives:

In [40]: arr
Out[40]: array([1, 2, 3, 0, 3])

In [41]: arr[:, np.newaxis]
Out[41]: 
array([[1],
       [2],
       [3],
       [0],
       [3]])

That is, arr[:, np.newaxis] reshapes arr into a 2-d array with shape (5, 1). (arr.reshape(-1, 1) could have been used instead.) Now when we compare this to np.arange(n) (a 1-d array with length n), broadcasting kicks in:

In [42]: np.arange(n) >= arr[:, np.newaxis]
Out[42]: 
array([[False,  True,  True,  True,  True],
       [False, False,  True,  True,  True],
       [False, False, False,  True,  True],
       [ True,  True,  True,  True,  True],
       [False, False, False,  True,  True]], dtype=bool)

As @RogerFan points out in his comment, this is basically an outer product of the arguments, using the >= operation.
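To make the outer-product remark concrete, here is a small sketch using the outer method that NumPy ufuncs provide. It uses np.less_equal instead of >= so that arr can be the first argument and supply the rows; the names outer and broadcast are only for illustration.

```python
import numpy as np

n = 5
arr = np.array([1, 2, 3, 0, 3])

# outer[i, j] is (arr[i] <= j): True once column j is past row i's leading zeros
outer = np.less_equal.outer(arr, np.arange(n))

# The broadcasting comparison from the session above, for reference
broadcast = np.arange(n) >= arr[:, np.newaxis]

same = np.array_equal(outer, broadcast)
print(same)
```

Both expressions build the same boolean matrix; the broadcasting form is just the more common spelling.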

A final cast to type int gives the desired result:

In [43]: (np.arange(n) >= arr[:, np.newaxis]).astype(int)
Out[43]: 
array([[0, 1, 1, 1, 1],
       [0, 0, 1, 1, 1],
       [0, 0, 0, 1, 1],
       [1, 1, 1, 1, 1],
       [0, 0, 0, 1, 1]])
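
If you would rather keep the approach from the question, where the matrix starts as all ones and the zeros are written in afterwards, the same comparison can serve as a boolean mask. A sketch using the question's 4x3 example (the name mask is mine, and dtype=int is an assumption so that the result holds integers):

```python
import numpy as np

arr = np.array([2, 1, 3, 0])        # number of leading zeros in each row
ones = np.ones((4, 3), dtype=int)   # start with all ones

# mask[i, j] is True where column j falls inside row i's leading zeros
mask = np.arange(ones.shape[1]) < arr[:, np.newaxis]
ones[mask] = 0
print(ones)
```

Note that this modifies ones in place, so the name is a little misleading after the assignment.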
Warren Weckesser answered Apr 01 '23 03:04


Not as concise as I wanted (I was experimenting with mask_indices), but this will also do the job:

>>> import numpy
>>> n = 3
>>> zeros = [2, 1, 3, 0]
>>> numpy.array([[0] * z + [1] * (n - z) for z in zeros])
array([[0, 0, 1],
       [0, 1, 1],
       [0, 0, 0],
       [1, 1, 1]])

It works very simply: for each row, the one-element lists [0] and [1] are each repeated the required number of times and then concatenated, so the array is built row by row.
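
For reference, the same idea can be wrapped in a small function and checked against a comparison-plus-broadcasting version. The function name leading_zeros_matrix is hypothetical, chosen only for this sketch, and it assumes every entry of zeros is at most n (otherwise the rows would have different lengths).

```python
import numpy as np

def leading_zeros_matrix(zeros, n):
    """Build one row per entry z of `zeros`: z zeros followed by n - z ones."""
    return np.array([[0] * z + [1] * (n - z) for z in zeros])

zeros = [2, 1, 3, 0]
n = 3
by_lists = leading_zeros_matrix(zeros, n)

# Same matrix via comparison and broadcasting, without a Python-level loop
by_broadcast = (np.arange(n) >= np.array(zeros)[:, np.newaxis]).astype(int)

print(np.array_equal(by_lists, by_broadcast))
```

The list version still loops in Python, so for large inputs the broadcasting form is likely the faster of the two.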

BartoszKP answered Apr 01 '23 01:04