
Create indicator matrix from two arrays in Python Numpy

Tags: python, numpy

Given two vectors, I would like to create an indicator matrix with one row per element of the first vector and one column per element of the second. For example, given a = np.array([5,5,3,4,4,4]) and b = np.array([5,4,3]), the result should be:

   5 4 3
5  1 0 0
5  1 0 0
3  0 0 1
4  0 1 0
4  0 1 0
4  0 1 0

What is the simplest way to achieve this?

Asked Jul 12 '17 at 17:07 by David



1 Answer

Using NumPy broadcasting -

(a[:,None]==b).astype(int)

Sample run -

In [104]: a
Out[104]: array([5, 5, 3, 4, 4, 4])

In [105]: b
Out[105]: array([5, 4, 3])

In [106]: (a[:,None]==b).astype(int)
Out[106]: 
array([[1, 0, 0],
       [1, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]])

If by simplest you meant most compact, here's a variant that does the type conversion by multiplying the boolean result by 1 -

In [107]: (a[:,None]==b)*1
Out[107]: 
array([[1, 0, 0],
       [1, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]])

Explanation: numpy.newaxis is an alias for None, and indexing with it adds a new axis of length 1. So, in this case, a[:,None] gives a 2D version of a with shape (6, 1). There are various other ways to get this 2D version, a.reshape(-1,1) being one of them. Comparing it against the 1D b (shape (3,)) triggers broadcasting, so every element of a is compared with every element of b, resulting in a 2D boolean array of matches with shape (6, 3). The final step is conversion to an int array.
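
The shapes involved in that broadcast can be checked directly. The following is a minimal sketch using the a and b from the question; the variable names (col_none, col_newaxis, col_reshape, matches, indicator) are only illustrative.

```python
import numpy as np

a = np.array([5, 5, 3, 4, 4, 4])
b = np.array([5, 4, 3])

# Three equivalent ways to turn the 1D array a, shape (6,),
# into a single-column 2D array, shape (6, 1).
col_none = a[:, None]
col_newaxis = a[:, np.newaxis]
col_reshape = a.reshape(-1, 1)

# Comparing shape (6, 1) against shape (3,) broadcasts to shape (6, 3):
# entry [i, j] is True exactly when a[i] == b[j].
matches = col_none == b
indicator = matches.astype(int)

print(col_none.shape, b.shape, matches.shape)  # (6, 1) (3,) (6, 3)
print(matches.dtype)                           # bool
```

All three column forms hold the same values, so any of them can be used on the left-hand side of the comparison.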

Step-by-step run -

In [141]: a
Out[141]: array([5, 5, 3, 4, 4, 4])

In [142]: b
Out[142]: array([5, 4, 3])

In [143]: a[:,None]
Out[143]: 
array([[5],
       [5],
       [3],
       [4],
       [4],
       [4]])

In [144]: a[:,None] == b
Out[144]: 
array([[ True, False, False],
       [ True, False, False],
       [False, False,  True],
       [False,  True, False],
       [False,  True, False],
       [False,  True, False]], dtype=bool)

In [145]: (a[:,None] == b).astype(int)
Out[145]: 
array([[1, 0, 0],
       [1, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]])
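
For reuse, the one-liner can be wrapped in a small helper. This is only a sketch: the function name indicator_matrix and its dtype parameter are hypothetical, and it assumes both inputs are 1D. It also shows np.equal.outer, which performs the same all-pairs comparison without the explicit new axis.

```python
import numpy as np

def indicator_matrix(values, categories, dtype=int):
    """Return a (len(values), len(categories)) matrix of 0/1 flags.

    Hypothetical helper; assumes both inputs are 1D.
    Entry [i, j] is 1 when values[i] == categories[j], else 0.
    """
    values = np.asarray(values)
    categories = np.asarray(categories)
    # Same broadcasting comparison as in the answer above.
    return (values[:, None] == categories).astype(dtype)

a = np.array([5, 5, 3, 4, 4, 4])
b = np.array([5, 4, 3])

result = indicator_matrix(a, b)

# The ufunc method np.equal.outer compares every element of a
# with every element of b, giving the same boolean matrix.
alt = np.equal.outer(a, b).astype(int)

print(np.array_equal(result, alt))  # True
```

When the values in b are unique and every element of a appears in b, as in the question, each row of the result contains exactly one 1.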

Answered Oct 02 '22 at 08:10 by Divakar